Commit c2b23a4

update to version 0.0.5
1 parent 0c55ba5 commit c2b23a4

11 files changed: +295 -102 lines changed

.coverage

0 Bytes
Binary file not shown.

.coveragerc

-2
@@ -1,8 +1,6 @@
 # .coveragerc to control coverage.py
 [run]
 branch = False
-include =
-    */src/*

 [report]
 # Regexes for lines to exclude from consideration

README.md

+134-51
@@ -7,17 +7,16 @@
 ![downloads](https://img.shields.io/pypi/dm/dictpy)
 ![license](https://img.shields.io/github/license/dylanwal/dictpy)

+Advanced tools for Python dictionaries.

+Included Tools:

-This Python Package can help with finding data in large complex Python dictionaries. These data structures of typical of
-JSON Files.
-
-Additionally, a serialization tool is included to turning custom python classes into JSON compatible dictionaries.
-
+* `DictSearch`: Search large and complex Python dictionaries/JSON files.
+* `Serializer`: Make custom Python classes JSON serializable (safe for conversion to JSON).

 ## Installation

-Pip installable package available.
+Pip installable package available.

 `pip install dictpy`

@@ -26,96 +25,180 @@ Pip installable package available.

 ---

-## Search Usage
-
-Load in a JSON file.
-
-```python
-import json
+## Searching (DictSearch)

-with open(data_path, "r") as f:
-    text = f.read()
-    json_data = json.loads(text)
-```
+Imagine you have some big ugly Python dictionary (like the one produced by [PubChem](https://pubchem.ncbi.nlm.nih.gov/)
+when you download the JSON file
+for [CID 6](https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/6/JSON/?response_type=display)) and you want to
+extract some specific piece of information. This section will show how `DictSearch` can make this easy.

-Preform search. It will find all valid objects in the search. Use `.result` to view results.
+To perform the search, pass the Python dictionary and a search target (discussed below) to
+`DictSearch`. It will find **all** valid objects for the search. The results of the search are stored in `.result`.

 ```python
 import dictpy

-search = DictSearch(data=json_data, target=target)
+search = dictpy.DictSearch(data=json_data, target=target)
 print(search.result)
-
+
 ```

-Example return object:
-The return object is a `list[list[tree, obj]]`.
+The return object is a `list[list[tree, obj]]`
+
+* `tree`: shows the navigation to get to the data ('.' separated)
+  * Keys are recorded for dictionaries
+  * Integers are recorded for positions in lists
+  * Example: `Record.Section.1.Description`
+  ```python
+  {"Record": {
+      "Section": [
+          ######,
+          {"Description": #####}  # A match to the search!
+      ]
+  }}
+  ```
+* `obj`: the returned object
+  * **Options**:
+    * Return current object (default)
+      * Returns the object you search for
+      * Example:
+        * search: `{"dog": "*"}`; returns: `{"dog": "golden retriever"}`
+        * search: `"dog"`; returns: `{"dog": "golden retriever"}`
+        * search: `{"dog": "golden retriever"}`; returns: `{"dog": "golden retriever"}`
+    * Return parent object
+      * Returns parent object or whole current level
+      * To switch to returning parent objects, change `return_func`.
+      ```python
+      search = dictpy.DictSearch(data=json_data, target=target, return_func=dictpy.DictSearch.return_parent_object)
+      ```
+      * Example
+        * search: `{"dog": "*"}`; returns:
+        ```python
+        {
+            "dog": "golden retriever",
+            "cat": "bengal",
+            "fish": "goldfish"
+        }
+        ```
+        * search: `"dog"`; returns:
+        ```python
+        {
+            "dog": "golden retriever",
+            "cat": "bengal",
+            "fish": "goldfish"
+        }
+        ```
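
To make the `tree` format described above concrete, here is a minimal sketch of walking a '.'-separated `tree` string back through the data. `follow_tree` is a hypothetical helper written for illustration; it is not part of dictpy, and it assumes that dictionary keys are strings containing no '.' characters.

```python
def follow_tree(data, tree):
    """Walk a '.'-separated tree string through nested dicts and lists.

    Hypothetical helper, not part of dictpy. Assumes dictionary keys are
    strings that contain no '.' characters.
    """
    current = data
    for step in tree.split("."):
        if isinstance(current, list):
            current = current[int(step)]  # integers index into lists
        else:
            current = current[step]  # anything else is treated as a dict key
    return current


data = {"Record": {"Section": [{"TOCHeading": "Structures"}, {"Description": "A match"}]}}
print(follow_tree(data, "Record.Section.1.Description"))  # prints: A match
```

Note that this returns whatever sits at the end of the path, which may differ from the `obj` stored next to the `tree` in a search result (for a dictionary target, `obj` is the key/value pair rather than the bare value).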
+
+### How to format `target`
+
+`target` accepts `strings`, `int`, `floats`, single-line `dictionaries`, and `regex` (regular expressions).
+Wildcards (`*`) can also be used for partial dictionary searches.
+
+Example Targets:
+* `{"RecordType": "CID"}`
+  * Will match exactly on both 'key' and 'value' (won't match list entries)
+* `{"RecordNumber": 6}`
+  * Will match exactly on both 'key' and 'value' (won't match list entries)
+  * With numbers, the default search behavior auto-converts strings to numbers.
+    * So this would also match `{"RecordNumber": "6"}`
+    * To change this behavior set `op_convert_str_to_num=False`
+* `2526`
+  * Will look for 2526 in either 'key', 'value' or list entry.
+* `"3D Conformer"`
+  * Will look for "3D Conformer" in either 'key', 'value' or list entry.
+* `{"MoveToTop": "*"}`
+  * Will look for "MoveToTop" as a dictionary 'key'; the 'value' can be anything (won't match list entries).
+* `{"*": "Chemical Safety"}`
+  * Will look for "Chemical Safety" as a dictionary 'value'; the 'key' can be anything (won't match list entries).
+* `"^[A-I]{3}$"`
+  * Regular expression search will match in either 'key', 'value' or list entry.
+* `{"^RecordT": "*"}`
+  * Regular expression search will match on the 'key'; the 'value' can be anything (won't match list entries).
+
+For more examples see
+[tests/test_dict_search.py](https://github.com/dylanwal/dictpy/blob/master/tests/test_dict_search.py).
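
The wildcard rules above can be illustrated with a small, self-contained sketch. This is not dictpy's implementation: `simple_search` is a hypothetical stand-in that handles only exact key/value matches and the `*` wildcard for single-pair dictionary targets (no regex, no string-to-number conversion), and the sample data is invented for the example.

```python
def simple_search(data, target_key, target_value="*", tree=""):
    """Hypothetical, simplified wildcard search (not dictpy's code).

    Returns a list of [tree, obj] pairs for every dictionary entry whose
    key and value match; "*" matches anything.
    """
    results = []
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{tree}.{key}" if tree else str(key)
            key_ok = target_key == "*" or key == target_key
            value_ok = target_value == "*" or value == target_value
            if key_ok and value_ok:
                results.append([path, {key: value}])
            # Keep descending so nested matches are found as well.
            results.extend(simple_search(value, target_key, target_value, path))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            path = f"{tree}.{index}" if tree else str(index)
            # List entries themselves never match; only dicts inside them can.
            results.extend(simple_search(item, target_key, target_value, path))
    return results


sample = {
    "Record": {
        "RecordType": "CID",
        "Section": [{"TOCHeading": "Structures"}, {"TOCHeading": "Chemical Safety"}],
    }
}

exact = simple_search(sample, "RecordType", "CID")
any_key = simple_search(sample, "*", "Chemical Safety")
any_value = simple_search(sample, "TOCHeading", "*")
print(exact)
print(any_key)
print(any_value)
```

The sketch records list positions as integers in the path, mirroring the `tree` convention described in this README; the real library may differ in details such as ordering or how matches are deduplicated.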

-`tree` shows the navigation to get to the data ('.' separated)

-`obj` shows the object found

-```python
-[['Record.Section.TOCHeading', {'TOCHeading': 'Chemical Safety'}],
-['Record.Section.Information.Name', {'Name': 'Chemical Safety'}]]
-```
+### Example
+
+This example will extract data from a JSON file for "1-Chloro-2,4-dinitrobenzene" downloaded from
+[PubChem](https://pubchem.ncbi.nlm.nih.gov/).

-### How to format target
+[Example JSON File](https://github.com/dylanwal/dictpy/blob/master/tests/cid_6.json)

-You can exact match on `strings`, `int`, `floats` and single line `dictionaries`. Examples:
+First, we will load our example above (change "/path/to/data/" to your file location for the file above):

 ```python
-{"RecordType": "CID"}
-{"RecordNumber": 6}
-{"TOCHeading": "Structures"}
-2526
-"3D Conformer"
+import json
+
+with open("C:/path/to/data/cid_6.json", "r") as f:
+    text = f.read()
+    json_data = json.loads(text)
+
+print(json_data)
 ```

-You also can do partial dictionary searches with "*" as a wild card.
+You will get a massive printout of the 12,000-line JSON file.

 ```python
-{"MoveToTop": "*"}
-{"*": "Chemical Safety"}
-```
+import dictpy

-### Options of what is returned
+search = dictpy.DictSearch(data=json_data, target={"RecordType": "CID"})
+print(search.result)
+```
+Print out:
+```python
+[['Record.RecordType', {'RecordType': 'CID'}]]
+```

-Currently, there are two return options. The exact object (default) or parent object.
-To change to parent object, change return function:
+Integer search target:
+```python
+search = dictpy.DictSearch(data=json_data, target=2526)
+print(search.result)
+```
+Print out:
 ```python
-search = DictSearch(data=json_data, target=target, return_func=DictSearch.return_parent_object)
+[
+    ['Record.Section.3.Section.1.Section.14.Information.1.Value.Number', 2526],
+    ['Record.Section.3.Section.1.Section.14.Information.1.Value.Number', 2526]
+]
 ```

 ---

 ---

-## Serialization
+## Serialization (Serializer)
+
+`Serializer` is useful for turning custom Python classes into JSON-compatible dictionaries.

 This serialization class is a useful pre-processing step for complex custom Python classes that contain objects that are
 not JSON serializable (for example: datetime objects, custom classes, classes from other packages, ObjectIDs, etc.).

 Inherit from `Serializer` in your custom Python class.

 ```python
-from dictpy import Serializer
 import json
+import datetime
+
+import dictpy

-class Example(Serializer):
+class Example(dictpy.Serializer):

-    def __init__(self, stuff, stuff2, stuff3):
-        self.stuff = stuff  # NOT JSON serializable object
+    def __init__(self, datetime_obj, stuff2):
+        self.datetime_obj = datetime_obj  # NOT JSON serializable object
         self.stuff2 = stuff2
-        self.stuff3 = stuff3
+        self.stuff3 = None
+

-example = Example(stuff, stuff2, stuff3)
+example = Example(datetime.time(), "stuff2")

 # json_output = json.dumps(example)  # This will fail with NOT JSON serializable objects

 dict_of_example = example.as_dict()
-dict_of_example = Serializer.dict_cleanup(dict_of_example)  # converts NOT JSON serializable objects to strings.
-dict_of_example = Serializer.remove_none(dict_of_example)  # Optional: remove None
+dict_of_example = dictpy.Serializer.dict_cleanup(dict_of_example)  # converts NOT JSON serializable objects to strings.
+dict_of_example = dictpy.Serializer.remove_none(dict_of_example)  # Optional: remove None; self.stuff3 removed

 json_output = json.dumps(dict_of_example)
 ```
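
For comparison, the same "stringify what JSON cannot encode, then drop `None`" idea can be sketched with the standard library alone. This is not how `dictpy.Serializer` works internally; the `Example` class below is a stand-in that does not inherit from `Serializer`, and `drop_none` is a hypothetical helper that only handles flat dictionaries.

```python
import datetime
import json


class Example:
    """Hypothetical stand-in class; it does not inherit from dictpy.Serializer."""

    def __init__(self, datetime_obj, stuff2):
        self.datetime_obj = datetime_obj  # not JSON serializable on its own
        self.stuff2 = stuff2
        self.stuff3 = None


def drop_none(d):
    """Return a copy of a flat dict without the keys whose value is None."""
    return {key: value for key, value in d.items() if value is not None}


example = Example(datetime.time(), "stuff2")

# vars() exposes the instance attributes as a plain dictionary.
as_dict = drop_none(vars(example))

# default=str is called for any value json cannot encode natively.
json_output = json.dumps(as_dict, default=str)
print(json_output)
```

The `default=str` hook is convenient for flat objects, but it applies one conversion to every unsupported type; a dedicated clean-up step such as the one shown in the README gives more control over nested structures.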

pyproject.toml

+1
@@ -16,6 +16,7 @@ testpaths = [
 [coverage]
 exclude_lines = [
     "pragma: no cover",
+    "def __str__",
     "def __repr__",
     "if __name__ == .__main__.:"
 ]

setup.cfg

+1-1
@@ -1,6 +1,6 @@
 [metadata]
 name = dictpy
-version = 0.0.4
+version = 0.0.5
 description = Searching and Serializing Python Dictionaries/JSON files.
 long_description = file: README.md
 long_description_content_type = text/markdown
