![downloads](https://img.shields.io/pypi/dm/dictpy)
![license](https://img.shields.io/github/license/dylanwal/dictpy)

+ Advanced tools for Python dictionaries.

+ Included Tools:

- This Python Package can help with finding data in large complex Python dictionaries. These data structures of typical of
- JSON Files.
-
- Additionally, a serialization tool is included to turning custom python classes into JSON compatible dictionaries.
-
+ * `DictSearch`: Search large and complex Python dictionaries/JSON files.
+ * `Serializer`: Make custom Python classes JSON serializable (safe for conversion to JSON).

## Installation

- Pip installable package available.
+ Pip installable package available.

`pip install dictpy`

@@ -26,96 +25,180 @@ Pip installable package available.
---

- ## Search Usage
-
- Load in a JSON file.
-
- ```python
- import json
+ ## Searching (DictSearch)

- with open(data_path, "r") as f:
-     text = f.read()
-     json_data = json.loads(text)
- ```
+ Imagine you have some big ugly Python dictionary (like the one produced by [PubChem](https://pubchem.ncbi.nlm.nih.gov/)
+ when you download the JSON file
+ for [CID 6](https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data/compound/6/JSON/?response_type=display)) and you want to
+ extract some specific piece of information. This section shows how `DictSearch` can make this easy.

- Preform search. It will find all valid objects in the search. Use `.result` to view results.
+ To perform the search, pass the Python dictionary and a search target (discussed below) to
+ `DictSearch`. It will find **all** valid objects for the search. The results of the search are stored in `.result`.

```python
import dictpy

- search = DictSearch(data=json_data, target=target)
+ search = dictpy.DictSearch(data=json_data, target=target)
print(search.result)
-
+
```

- Example return object:
- The return object is a `list[list[tree, obj]]`.
+ The return object is a `list[list[tree, obj]]`
+
+ * `tree`: shows the navigation to get to the data ('.' separated)
+     * Keys are recorded for dictionaries
+     * Integers are recorded for positions in lists
+     * Example: `Record.Section.1.Description`
+     ```python
+     {"Record": {
+         "Section": [
+             ...,  # index 0
+             {"Description": ...}  # index 1: a match to the search!
+         ]
+     }}
+     ```
+ * `obj`: returns the object
+     * **Options**:
+         * Return current object (default)
+             * Returns the object you search for
+             * Example:
+                 * search: `{"dog": "*"}`; returns: `{"dog": "golden retriever"}`
+                 * search: `"dog"`; returns: `{"dog": "golden retriever"}`
+                 * search: `{"dog": "golden retriever"}`; returns: `{"dog": "golden retriever"}`
+         * Return parent object
+             * Returns parent object or whole current level
+             * To switch to returning parent objects, change `return_func`.
+             ```python
+             search = dictpy.DictSearch(data=json_data, target=target, return_func=dictpy.DictSearch.return_parent_object)
+             ```
+             * Example
+                 * search: `{"dog": "*"}`; returns:
+                 ```python
+                 {
+                     "dog": "golden retriever",
+                     "cat": "bengal",
+                     "fish": "goldfish"
+                 }
+                 ```
+                 * search: `"dog"`; returns:
+                 ```python
+                 {
+                     "dog": "golden retriever",
+                     "cat": "bengal",
+                     "fish": "goldfish"
+                 }
+                 ```
+
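The `tree` strings described above can be walked back through the original data with a few lines of plain Python. The sketch below is not part of dictpy: `follow_tree` is a hypothetical helper, and it assumes that no dictionary key contains a '.' and that all dictionary keys along the path are strings.

```python
from typing import Any


def follow_tree(data: Any, tree: str) -> Any:
    """Walk `data` along a '.'-separated path such as 'Record.Section.1.Description'.

    Hypothetical helper for illustration; not a dictpy function.
    """
    current = data
    for part in tree.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # list positions are recorded as integers
        else:
            current = current[part]  # dictionary keys are recorded as-is
    return current


data = {"Record": {"Section": [{"Name": "first"}, {"Description": "second"}]}}
print(follow_tree(data, "Record.Section.1.Description"))  # second
```

Judging by the printed results elsewhere in this README, the path ends at the matched key, so a helper like this returns the value stored under that key rather than the `{key: value}` object found in `.result`.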
+ ### How to format `target`
+
+ `target` accepts `strings`, `int`, `floats`, single-line `dictionaries`, and `regex` (regular expressions).
+ Wildcards (`*`) can also be used for partial dictionary searches.
+
+ Example Targets:
+ * `{"RecordType": "CID"}`
+     * Matches both 'key' and 'value' exactly (won't match list entries)
+ * `{"RecordNumber": 6}`
+     * Matches both 'key' and 'value' exactly (won't match list entries)
+     * With numbers, the default search behavior auto-converts strings to numbers.
+         * So this would also hit `{"RecordNumber": "6"}`
+         * To change this behavior, set `op_convert_str_to_num=False`
+ * `2526`
+     * Will look for 2526 in either 'key', 'value' or list entry.
+ * `"3D Conformer"`
+     * Will look for "3D Conformer" in either 'key', 'value' or list entry.
+ * `{"MoveToTop": "*"}`
+     * Will look for "MoveToTop" as a dictionary 'key'; the 'value' can be anything (won't match list entries)
+ * `{"*": "Chemical Safety"}`
+     * Will look for "Chemical Safety" as a dictionary 'value'; the 'key' can be anything (won't match list entries)
+ * `"^[A-I]{3}$"`
+     * Regular expression search will match in either 'key', 'value' or list entry.
+ * `{"^RecordT": "*"}`
+     * Regular expression search will match on the 'key'; the 'value' can be anything (won't match list entries)
+
+ For more examples see
+ [tests/test_dict_search.py](https://github.com/dylanwal/dictpy/blob/master/tests/test_dict_search.py).
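To make the matching rules above concrete, here is a rough, self-contained sketch of how a single-pair dictionary target could be compared against one key/value pair. It is an illustration under assumptions, not dictpy's actual implementation: `part_matches` and `pair_matches` are hypothetical names, and regular expressions are passed as compiled patterns here (rather than plain strings) so that the sketch does not have to guess how a regex is told apart from an ordinary string.

```python
import re
from typing import Any


def _to_number(value: Any) -> Any:
    """Convert numeric strings such as "6" to a float; leave everything else unchanged."""
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value
    return value


def part_matches(pattern: Any, candidate: Any, convert_str_to_num: bool = True) -> bool:
    """Compare one half of a target (key or value) against a candidate."""
    if isinstance(pattern, str) and pattern == "*":
        return True  # wildcard: anything matches
    if isinstance(pattern, re.Pattern):
        return isinstance(candidate, str) and pattern.search(candidate) is not None
    if isinstance(pattern, (int, float)) and convert_str_to_num:
        candidate = _to_number(candidate)  # "6" is treated like 6
    return pattern == candidate  # everything else must match exactly


def pair_matches(target: dict, key: Any, value: Any, convert_str_to_num: bool = True) -> bool:
    """Check a single-pair dictionary target against one key/value pair."""
    ((target_key, target_value),) = target.items()
    return part_matches(target_key, key, convert_str_to_num) and part_matches(
        target_value, value, convert_str_to_num
    )


print(pair_matches({"RecordNumber": 6}, "RecordNumber", "6"))  # True
print(pair_matches({"RecordNumber": 6}, "RecordNumber", "6", convert_str_to_num=False))  # False
print(pair_matches({"*": "Chemical Safety"}, "TOCHeading", "Chemical Safety"))  # True
print(pair_matches({re.compile("^RecordT"): "*"}, "RecordType", "CID"))  # True
```

The `convert_str_to_num` flag in this sketch only mirrors the idea behind `op_convert_str_to_num`; the real option may behave differently in edge cases.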

- `tree` shows the navigation to get to the data ('.' separated)

- `obj` shows the object found

- ```python
- [['Record.Section.TOCHeading', {'TOCHeading': 'Chemical Safety'}],
-  ['Record.Section.Information.Name', {'Name': 'Chemical Safety'}]]
- ```
+ ### Example
+
+ This example extracts data from a JSON file for "1-Chloro-2,4-dinitrobenzene" downloaded from
+ [PubChem](https://pubchem.ncbi.nlm.nih.gov/).

- ### How to format target
+ [Example JSON File](https://github.com/dylanwal/dictpy/blob/master/tests/cid_6.json)

- You can exact match on `strings`, `int`, `floats` and single line `dictionaries`. Examples:
+ First, load the example file linked above (change "/path/to/data/" to the location of your copy):

```python
- {"RecordType": "CID"}
- {"RecordNumber": 6}
- {"TOCHeading": "Structures"}
- 2526
- "3D Conformer"
+ import json
+
+ with open("C:/path/to/data/cid_6.json", "r") as f:
+     text = f.read()
+     json_data = json.loads(text)
+
+ print(json_data)
```

- You also can do partial dictionary searches with "*" as a wild card.
+ You will get a massive printout of the 12,000-line JSON file.

```python
- {"MoveToTop": "*"}
- {"*": "Chemical Safety"}
- ```
+ import dictpy

- ### Options of what is returned
+ search = dictpy.DictSearch(data=json_data, target={"RecordType": "CID"})
+ print(search.result)
+ ```
+ Print out:
+ ```python
+ [['Record.RecordType', {'RecordType': 'CID'}]]
+ ```

- Currently, there are two return options. The exact object (default) or parent object.
- To change to parent object, change return function:
+ Integer search target:
+ ```python
+ search = dictpy.DictSearch(data=json_data, target=2526)
+ print(search.result)
+ ```
+ Print out:
```python
- search = DictSearch(data=json_data, target=target, return_func=DictSearch.return_parent_object)
+ [
+     ['Record.Section.3.Section.1.Section.14.Information.1.Value.Number', 2526],
+     ['Record.Section.3.Section.1.Section.14.Information.1.Value.Number', 2526]
+ ]
```
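Each entry in `.result` is a `[tree, obj]` pair, so results unpack naturally in a loop. The minimal sketch below uses the first printout above as literal data, so it runs without dictpy or the JSON file; `summarize` is a hypothetical helper name, not a dictpy function.

```python
def summarize(result):
    """Return (depth, last_path_part, obj) for every [tree, obj] entry."""
    rows = []
    for tree, obj in result:
        parts = tree.split(".")  # '.'-separated navigation path
        rows.append((len(parts), parts[-1], obj))
    return rows


# Literal copy of the printout for target={"RecordType": "CID"}.
result = [["Record.RecordType", {"RecordType": "CID"}]]

for depth, last_part, obj in summarize(result):
    print(depth, last_part, obj)  # 2 RecordType {'RecordType': 'CID'}
```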

---

---

- ## Serialization
+ ## Serialization (Serializer)
+
+ `Serializer` is useful for turning custom Python classes into JSON-compatible dictionaries.

This serialization class is a useful pre-processing step for complex custom Python classes that contain objects that are
not JSON serializable (e.g., datetime objects, custom classes, classes from other packages, ObjectIDs, etc.).

Inherit from `Serializer` in your custom Python class.

```python
- from dictpy import Serializer
import json
+ import datetime
+
+ import dictpy

- class Example(Serializer):
+ class Example(dictpy.Serializer):

-     def __init__(self, stuff, stuff2, stuff3):
-         self.stuff = stuff  # NOT JSON serializable object
+     def __init__(self, datetime_obj, stuff2):
+         self.datetime_obj = datetime_obj  # NOT JSON serializable object
        self.stuff2 = stuff2
-         self.stuff3 = stuff3
+         self.stuff3 = None
+

- example = Example(stuff, stuff2, stuff3)
+ example = Example(datetime.time(), "stuff2")

# json_output = json.dumps(example)  # This will fail with NOT JSON serializable objects

dict_of_example = example.as_dict()
- dict_of_example = Serializer.dict_cleanup(dict_of_example)  # converts NOT JSON serializable objects to strings.
- dict_of_example = Serializer.remove_none(dict_of_example)  # Optional: remove None
+ dict_of_example = dictpy.Serializer.dict_cleanup(dict_of_example)  # converts NOT JSON serializable objects to strings.
+ dict_of_example = dictpy.Serializer.remove_none(dict_of_example)  # Optional: remove None; self.stuff3 removed

json_output = json.dumps(dict_of_example)
```
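For readers who want to see what "convert non-serializable objects to strings" and "remove None" amount to, here is a standard-library-only sketch of the same idea. It is not dictpy's implementation: `cleanup` and `drop_none` are hypothetical stand-ins for `dict_cleanup` and `remove_none`, and the real methods may handle more cases.

```python
import datetime
import json


def cleanup(value):
    """Recursively replace values that json cannot encode with their string form."""
    if isinstance(value, dict):
        return {str(k): cleanup(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [cleanup(v) for v in value]
    if value is None or isinstance(value, (str, int, float, bool)):
        return value  # already JSON compatible
    return str(value)  # e.g. datetime.time() becomes "00:00:00"


def drop_none(value):
    """Recursively remove dictionary entries whose value is None."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none(v) for v in value]
    return value


raw = {"datetime_obj": datetime.time(), "stuff2": "stuff2", "stuff3": None}
clean = drop_none(cleanup(raw))
print(json.dumps(clean))  # {"datetime_obj": "00:00:00", "stuff2": "stuff2"}
```

For flat data, `json.dumps(raw, default=str)` gives a similar string fallback in one call, but it keeps `None` values as `null`.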