Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chifra export reconciliation improvement proposal in light of inter-block transactions and #1961 #2039

Closed
0xAdil opened this issue Jan 30, 2022 · 2 comments

Comments

@0xAdil
Copy link

0xAdil commented Jan 30, 2022

After landing on an impossiblish case with inter-block transactions (tx.value not accounted in statement), I investigated the reconciliation.cpp library. Following will not cover this bizarre netAmount leakage, because I haven't reproduced it outside of my address, instead otherwise improvement of the library and statement creation.

For perspective, I redefined the reconciled() to inputs = outputs as such:

// condition 1 
begBal + totalIn == totalOut + endBal
// condition 2 i.e. no missing transactions between this and last
begBal == prevRecon.endBal

In case of prevdiff-partial and partial-partial i.e. all but the last IBT, condition 1 is forced to be truthy by substitution of endBal.

This may be desirable, however we may rationalize IBT reconciliation by stating that without knowledge of balance at the end of a transaction, only one of the following is most reasonable case:

  1. Treated as a group of transactions, by reducing the inputs and outputs to aggregate. Final reconciliation applies consistently to all IBTs.
  2. Reconciliation is set to null for all IBTs.

First option is not contrary to what is already being done by carry forward of available outputs of a transaction, however the implementation makes the resulting reconciled boolean inconsistent, as burden is shifted to last IBT.

This combined with prior motivation of parallelizing and my perpetual state of a refactorer, I am proposing an algorithm with some overlay structure. Didn't attempt observables, it seems anti-pattern to the codebase. I believe this opens up possibilities mentioned in #1961 and more with flexibility. It can also be conveniently extended to support the above cases for IBTs if that is feasible.

My C is rusty (no pun intended), following is an attempt at reference implementation while annihilating pointers.

typedef enum {
    PENDING, // default state
    WAITING, // waiting for required neighbor recon state
    INITIALIZED, // recon calculations are initialized
    COMPLETED // recon calculations are completed
} e_recon_status;

typedef enum {
    GENESIS,
    REGULAR,
    TRACE,
    PREVDIFF_PARTIAL,
    PARTIAL_PARTIAL,
    PARTIAL_NEXTDIFF,
} e_recon_type;

// metatype purposed for bare minimum initializer to reduce waiting time for neighbor
// prior to creating Reconciliation object
// alternatively this can be replaced with embedding status into Reconciliation class but this way keeps data flow into it from one scope

struct ReconState {
    uint32 blockNumber;
    uint32 transactionIndex;
    uint32 timestamp;
    e_recon_status status;
    uint32 lastBlockEndBal;
    uint32 endBal;
    e_recon_type reconType;
    CReconciliation* recon;
    ReconState* prev;
    ReconState* next;
};

////
// inside reconciliation.cpp

void Reconciliation::reconcileEth(
    const ReconState* reconState,
    const CTransaction* trans,
    const CAccountName& accountedFor
    ) {
    // to be stringified through a preset map<e_recon_type, char*>
    // this may only be updated later in case of trace
    reconciliationType = reconState->reconType;

    // same implementation using reconState instead of prevRecon, some things to consider:
    // * trace method should probably be called definitely for IBTs and after setting their specific begBal, endBal
}

////
// inside the caller module 

// called upon availability of both appearance block balance, last block balance and also transaction data
void  reconcilliationCallback(ReconState reconState*, const uint32 blkBal, const uint32 lastBlkBal, const CTransaction* trans) {

    // set these immediately to enable other callbacks to succeed themselves
    reconState->endBal = blkBal;
    reconState->lastBlockEndBal = lastBlkBal;

    bool prevDifferent = reconState->prev->blockNumer != reconState->blockNumber;
    bool nextDifferent = reconState->blockNumber != reconState->next->blockNumber;
    
    // set reconType as in current lib
    // ...
    if (reconState->blockNumber == 0) {
        reconState->reconType = e_recon_type.GENESIS;

    } else {
        if (prevDifferent && nextDifferent) {
            reconState->reconType = e_recon_type.REGULAR;

        } else if (prevDifferent) {
            reconState->reconType = e_recon_type.PREVDIFF_PARTIAL;
        
        } else if (nextDifferent) {    
            reconState->reconType = e_recon_type.PARTIAL_NEXTDIFF;
        
        } else {
            reconState->reconType = e_recon_type.PARTIAL_PARTIAL;
        }
    }

    // potentially some locking around here

    reconState->status = e_recon_status.WAITING;

    CReconciliation currEth;
    CReconciliation nextEth;

    bool isDependentIBT = (int)reconType == e_recon_type.PARTIAL_PARTIAL || (int)reconType == e_recon_type.PARTIAL_NEXTDIFF;

    // call yourself if previous recon has loaded required data
    // first / last tx handling to go here
    if(reconState->status == e_recon_status.WAITING && ( !isDependentIBT
        ? reconState->prev->status != e_recon_status.PENDING
        : reconState->prev->status == e_recon_status.COMPLETED)
    ) {
        reconState->status = e_recon_status.INITIALIZED;
        currEth(reconState->blockNumber, reconState->transactionIndex, reconState->timestamp);
        curr.reconcileEth(reconState, ...);
        reconState->recon = currEth;
        reconState->status = e_recon_status.COMPLETED;
    }

    // call the next one if it's waiting
    if(reconState->status == e_recon_status.WAITING) {
        // same as above for nextEth using nextReconState
    }
}

// whichever concurrent logic for hitting RPC or pulling cache, depending on resource params, could be naive chunking or smart chunking

void iterator() {
    // every [blkBal, lastBlkBal, tx] passes to reconciliationCallback(...)
    // and even other per appearance aggregation processes
}

The iterator is a minimal refactor of process_reconciliation() or perhaps beyond that in context of #1790. I haven't audited these files in detail.

I might attempt a js implementation which can partially simulate and validate this, if it's feasible for the state of trueblocks, I could try a PR.

@tjayrush tjayrush changed the title Reconciliation improvement proposal in light of inter-block transactions and #1961 Export: Reconciliation improvement proposal in light of inter-block transactions and #1961 Feb 23, 2022
@tjayrush tjayrush changed the title Export: Reconciliation improvement proposal in light of inter-block transactions and #1961 chifra export: Reconciliation improvement proposal in light of inter-block transactions and #1961 May 29, 2022
@tjayrush tjayrush changed the title chifra export: Reconciliation improvement proposal in light of inter-block transactions and #1961 chifra export reconciliation improvement proposal in light of inter-block transactions and #1961 Oct 18, 2022
@tjayrush
Copy link
Member

tjayrush commented Nov 4, 2022

This is astonishing. Thank you so much for writing this up. I am finally (after nearly a year) getting back to re-writing the reconciliation code. This current re-write won't be the final one -- the final one will be in the GoLang code, but I will take these ideas into both the current re-write (which probably won't implement them) and certainly into the future GoLang code.

I love that your suggestion address the parallization issue mentioned in the linked issue. That's an important part of this re-write as is reversibility.

Per-transaction balances are coming to the Erigon node, so, while your suggestion is appreciated, I think, at least for Erigon, we will be able to directly query end-of-transaction balances. For non-Erigon nodes, we may fall back to something similar to the above.

I'll leave this open until it's fully resolved, and thanks again for taking the time to write this up.

@tjayrush
Copy link
Member

tjayrush commented Nov 8, 2022

I'm going to close this, as it has been fully addressed in the forthcoming code. We chose to continue to reconcile inter-block transactions in the way we do (where we "simulate" the ending balances of transactions with other transactions following in the same block and simulate beginning balance for transactions with previous transactions in the same block. This seems reasonable and allows us to continue to say we balance every transaction. Given that the last transaction in a block will balance, we can conclude that every intermediate value was correct.

Closing....

@tjayrush tjayrush closed this as completed Nov 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants