chifra export parallelizing reconciliations #1961

Open
tjayrush opened this issue Dec 12, 2021 · 1 comment

We're finding that reconciliations are the slowest part of a full extraction (as we already knew).

One reason is that reconciliations are sequential: we need the calculated balance at the end of the previous transaction before we can reconcile the current one.

This has two consequences.

  1. It forces sequential reconciliation.
  2. It makes backward reconciliation difficult, so we can't really show a reverse chronological view.

Both problems become easier if we break reconciliations into two passes:

First, reconcile the internals of each transaction: `nodeBegBal + income - outflow == nodeEndBal`. This should always reconcile, so it's really only a reading of the values from the chain. We can do this step in a highly parallel way because each transaction's internal reconciliation is independent of every other.
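A minimal sketch of that first pass, assuming each transaction's values have already been read into a small record. The `Statement` class, its field names, and `first_pass` are hypothetical and not chifra's actual types. In practice the per-transaction work would mostly be node queries, which is where the concurrency would pay off; here the check is only the arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Per-transaction values (illustrative field names, not chifra's schema)."""
    tx_id: int
    beg_bal: int   # balance reported by the node before the transaction
    income: int
    outflow: int
    end_bal: int   # balance reported by the node after the transaction


def reconciles_internally(s: Statement) -> bool:
    # Uses only this transaction's own values, so it needs no neighbor.
    return s.beg_bal + s.income - s.outflow == s.end_bal


def first_pass(statements, workers=8):
    # Every check is independent, so the pool may run them in any order.
    # pool.map() still returns the results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconciles_internally, statements))


statements = [
    Statement(1, 0, 100, 0, 100),
    Statement(2, 100, 0, 30, 70),
    Statement(3, 70, 5, 0, 80),  # deliberately inconsistent: 70 + 5 != 80
]
print(first_pass(statements))  # [True, True, False]
```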

Second, reconcile each transaction's beginning balance with the previous transaction's ending balance (or, if going backwards, the current transaction's ending balance with the next transaction's beginning balance).
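The linking pass could look something like the sketch below, in either direction. The names (`Statement`, `link_forward`, `link_backward`) are made up for illustration, and treating the first transaction seen as reconciled, because it has no neighbor to compare against, is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Only the two balances matter for linking (illustrative names)."""
    tx_id: int
    beg_bal: int
    end_bal: int


def link_forward(statements):
    """Oldest first: ok when beg_bal equals the previous transaction's end_bal."""
    prev = None
    for s in statements:
        yield s.tx_id, prev is None or s.beg_bal == prev.end_bal
        prev = s


def link_backward(statements):
    """Newest first: ok when end_bal equals the next transaction's beg_bal."""
    nxt = None
    for s in reversed(statements):
        yield s.tx_id, nxt is None or s.end_bal == nxt.beg_bal
        nxt = s


chronological = [
    Statement(1, 0, 100),
    Statement(2, 100, 70),
    Statement(3, 75, 80),  # begins at 75, but transaction 2 ended at 70
]
print(list(link_forward(chronological)))   # [(1, True), (2, True), (3, False)]
print(list(link_backward(chronological)))  # [(3, True), (2, False), (1, True)]
```

Note that the same gap is reported against transaction 3 going forward and against transaction 2 going backward, since each direction only knows about the neighbor it has already seen.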

This solves both problems and should greatly speed up the reconciliations.

We could even do this in batches, say month by month heading backwards, so we can deliver results to the front end sooner.
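One way the batching might be sketched, assuming each transaction carries a Unix timestamp and the list is already in chronological order. `monthly_batches_backward` and the `(timestamp, tx_id)` tuples are hypothetical, and months are taken in UTC.

```python
from datetime import datetime, timezone
from itertools import groupby


def month_of(timestamp):
    # Bucket by calendar month in UTC (an assumption).
    d = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return (d.year, d.month)


def monthly_batches_backward(txs):
    """txs: chronological list of (timestamp, tx_id). Yields newest month first."""
    for month, group in groupby(reversed(txs), key=lambda t: month_of(t[0])):
        yield month, [tx_id for _, tx_id in group]


def ts(year, month, day):
    # Helper to build example timestamps without hard-coding epoch values.
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


txs = [(ts(2021, 10, 5), 1), (ts(2021, 11, 2), 2),
       (ts(2021, 11, 20), 3), (ts(2021, 12, 1), 4)]
for month, ids in monthly_batches_backward(txs):
    print(month, ids)
# (2021, 12) [4]
# (2021, 11) [3, 2]
# (2021, 10) [1]
```

Each batch can be handed to the front end as soon as it is linked; the only state that has to carry over to the next (older) batch is the beginning balance of the oldest transaction already processed.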

@tjayrush tjayrush changed the title Parallelizing Reconciliations Export: Parallelizing Reconciliations Feb 23, 2022
@tjayrush tjayrush changed the title Export: Parallelizing Reconciliations chifra export: Parallelizing Reconciliations May 29, 2022
@tjayrush tjayrush changed the title chifra export: Parallelizing Reconciliations chifra export parallelizing reconciliations Oct 18, 2022

tjayrush commented Nov 7, 2022

Also, we can write the reconciliations to the cache during the first pass (the internal reconciliation) and then read them back from the cache in the pipeline that handles inter-transaction reconciliation. That way we don't have to hold the entire set in memory, which means we can, in effect, stream.
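A rough sketch of that idea, using a JSON-lines file as a stand-in for the cache. The file format, the record keys, and the function names are all assumptions for illustration and are not how chifra's cache actually works. The point is only that the linking pass needs to remember one ending balance at a time.

```python
import json
import os
import tempfile


def write_cache(path, records):
    # First pass: persist each internally reconciled record as it is produced.
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def read_cache(path):
    # Generator: yields one record at a time instead of loading the whole file.
    with open(path) as f:
        for line in f:
            yield json.loads(line)


def link_from_cache(path):
    # Second pass: only the previous ending balance is kept in memory.
    prev_end = None
    for rec in read_cache(path):
        ok = prev_end is None or rec["beg_bal"] == prev_end
        yield rec["tx_id"], ok
        prev_end = rec["end_bal"]


records = [
    {"tx_id": 1, "beg_bal": 0, "end_bal": 100},
    {"tx_id": 2, "beg_bal": 100, "end_bal": 70},
    {"tx_id": 3, "beg_bal": 75, "end_bal": 80},  # does not follow from 70
]
with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "recons.jsonl")
    write_cache(cache_path, records)
    results = list(link_from_cache(cache_path))
print(results)  # [(1, True), (2, True), (3, False)]
```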
