Commit 0ac04d5

LunarLanding and kynan authored
Update README.md with git-filter-repo (#194)
Update README.md with git-filter-repo example

Fixes #193.

---------

Co-authored-by: Florian Rathgeber <[email protected]>
1 parent 883d871 commit 0ac04d5

1 file changed, +23 -16 lines changed

README.md
@@ -197,23 +197,30 @@ Note that you need to uninstall with the same flags:
 ### Apply retroactively
 
 `nbstripout` can be used to rewrite an existing Git repository using
-`git filter-branch` to strip output from existing notebooks. This invocation
-uses `--index-filter` and operates on all ipynb-files in the repo: :
-
-git filter-branch -f --index-filter '
-git checkout -- :*.ipynb
-find . -name "*.ipynb" -exec nbstripout "{}" +
-git add . --ignore-removal
+[`git filter-repo`](https://github.com/newren/git-filter-repo) to strip output
+from existing notebooks. This invocation operates on all ipynb files in the repo:
+
+```sh
+#!/usr/bin/env bash
+# get lint-history with callback from https://github.com/newren/git-filter-repo/pull/542
+./lint-history.py --relevant 'return filename.endswith(b".ipynb")' --callback '
+import json, warnings, nbformat
+from nbstripout import strip_output
+from nbformat.reader import NotJSONError
+try:
+    with warnings.catch_warnings():
+        warnings.simplefilter("ignore", category=UserWarning)
+        notebook = nbformat.reads(blob.data, as_version=nbformat.NO_CONVERT)
+    # customize to your needs
+    strip_output(notebook, keep_output=False, keep_count=False, keep_id=False, extra_keys=["metadata.widgets","metadata.execution","cell.attachments"], drop_empty_cells=True, drop_tagged_cells=[],strip_init_cells=False, max_size=0)
+    old_len = len(blob.data)
+    blob.data = (nbformat.writes(notebook) + "\n").encode("utf-8")
+    if old_len != len(blob.data):
+        print(change.blob_id, change.filename, old_len, len(blob.data))
+except NotJSONError as e:
+    print("ERROR", type(e), change.blob_id, filename)
 '
-
-If the repository is large and the notebooks are in a subdirectory it will run
-faster with `git checkout -- :<subdir>/*.ipynb`. You will get a warning for
-commits that do not contain any notebooks, which can be suppressed by piping
-stderr to `/dev/null`.
-
-This is a potentially slower but simpler invocation using `--tree-filter`:
-
-git filter-branch -f --tree-filter 'find . -name "*.ipynb" -exec nbstripout "{}" +'
+```
 
 ### Removing empty cells
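
Reviewer note: the callback added above rewrites each `.ipynb` blob in place. To make the effect on a single blob easier to follow, here is a minimal standard-library sketch of the same read, strip, re-serialize cycle. It is an illustration only and not how `nbstripout` is implemented: the function name `strip_notebook_outputs` is hypothetical, it only clears `outputs` and `execution_count` on code cells, and it assumes nbformat 4 style JSON with a top-level `cells` list. The real `strip_output` call in the diff handles many more options.

```python
import json


def strip_notebook_outputs(raw: bytes) -> bytes:
    """Return notebook bytes with code-cell outputs and execution counts cleared.

    Hypothetical helper for illustration; assumes nbformat 4 JSON
    (a top-level "cells" list). Not the nbstripout implementation.
    """
    notebook = json.loads(raw.decode("utf-8"))
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    # Mirror the callback in the diff: serialize, append a newline, encode as UTF-8.
    return (json.dumps(notebook, indent=1, ensure_ascii=False) + "\n").encode("utf-8")


if __name__ == "__main__":
    sample = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 3,
                "metadata": {},
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
                ],
                "source": ["print('hi')"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    before = json.dumps(sample, indent=1).encode("utf-8")
    after = strip_notebook_outputs(before)
    # Like the callback, report the blob size before and after stripping.
    print(len(before), len(after))
```

As in the callback, comparing the byte length before and after is a cheap way to see which blobs were actually changed by the rewrite.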
