Bank transaction imports
In LedgerSMB versions 1.3+, there's improved bank reconciliation functionality. This functionality depends on a working import function for bank statements. Additionally, it depends on all transactions having been entered before the reconciliation starts -- or on additional transactions being entered during reconciliation. However, many users have indicated that entering all transactions and payments is a labour-intensive process which could do with automation support.
From discussion it seems that basic components of a workflow to import transactions are in place or should have been in place in LedgerSMB:
1. Parsing bank statements into "meaningful transaction data"
2. Matching the transaction data with existing payments
3. Matching remaining transactions with remaining open items to generate payments
4. Matching unmatched transactions with template transactions
Step (2) is in place for the bank reconciliation. Since step (1) is available only as custom code (for which there's very little documentation), very few installations are likely to use this functionality, and the bank reconciliation feature is probably under-utilized. Step (3) is new functionality that needs to be realized completely.
As a generic implementation for step (1) is missing at the moment, the first step is to draft a design filling this gap. The input should be translated by a parser to output to be used by steps (2) and (3) to reconcile and/or create transactions. (Note that creating transactions is more demanding than "just" reconciling -- as such, more output is required than is currently the case.)
A parser will transform the input(s) according to the definitions below. A parser-instance will be defined for a "format"; a format being a type of input (CSV, XML, ...) including its associated configuration.
Each (bank)account will be associated with at least one parser-instance (=parser+configuration).
The parser transforms the input (file or stream) into an array of hashes with the following keys:
- amount (negative for debit/positive for credit)
- currency
- book date
- value date
- description
- bank account of the counterparty
- 'id's provided by a payment processor
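To make the intended output concrete, here is a minimal sketch of a single parsed statement line. Python is used purely for illustration (LedgerSMB itself is not written in Python), and the key names are assumptions that the final design would have to fix:

```python
# Illustrative only: one parsed bank statement line as a dict ("hash").
# Key names are assumptions, not a finalised specification.
statement_lines = [
    {
        "amount": -125.40,                        # negative for debit, positive for credit
        "currency": "EUR",
        "book_date": "2023-01-15",
        "value_date": "2023-01-16",
        "description": "Invoice 2023-0042 ACME Ltd",
        "counterparty_account": "NL91ABNA0417164300",
        "processor_ids": {"end_to_end_id": "E2E-0001"},  # ids provided by a payment processor
    },
]
```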
The parser is provided with the following input data:
- bank account number as filter
- file or handle/stream to read from (or an array thereof)
Most users will be served best by a CSV/TXT/TAB-separated parser. Looking a bit further into the future, a large ISO-20022 migration is upcoming for SWIFT (payment) processing; supporting the ISO-20022 format is therefore likely to make the application future-proof. Because the format is standardized, much less configuration should be required, which will simplify setup for users.
The CSV parser will support the following configuration items:
- field_separator: "\t", ",", ";", ...
- decimal_separator: '.', ","
- thousands_separator: ",", '.'
- has_headers: true/false
- column_names: [ 'col_1', 'col_2', 'amount', ...]
- debit_credit_indicator: '', { 'debit': 'DT', 'credit': 'CR' }
- column_map: { ... }
Note that the `column_names` configuration is only required when `has_headers` is false. Also note that the `debit_credit_indicator` isn't required when the `amount` output already has the correct signs in the source data.
The column map serves to map the columns in the input (presumably provided by the headers in the source file) to the keys expected in the output.
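As an illustration, a configuration for a semicolon-separated file with headers might look as follows. The option names follow the list above; the header names inside `column_map` are invented examples:

```python
# Illustrative CSV parser configuration (values are examples only).
csv_config = {
    "field_separator": ";",
    "decimal_separator": ",",
    "thousands_separator": ".",
    "has_headers": True,
    # column_names would only be needed when has_headers is False:
    # "column_names": ["book_date", "value_date", "amount", "description"],
    "debit_credit_indicator": {"debit": "DT", "credit": "CR"},
    # Map the (header) columns of the input file to the output keys:
    "column_map": {
        "Boekdatum": "book_date",
        "Valutadatum": "value_date",
        "Bedrag": "amount",
        "Omschrijving": "description",
        "Tegenrekening": "counterparty_account",
    },
}
```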
The XML parser will have the following configuration items:
- XPath mapping of the fields in the XML to the output fields
- debit_credit_indicator: '', { 'debit': 'DT', 'credit': 'CR' }
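A comparable sketch for the XML parser, with hypothetical XPath expressions that are not taken from any specific schema (such as ISO-20022 camt.053):

```python
# Illustrative XML parser configuration: output fields mapped to XPath expressions.
xml_config = {
    "xpath_map": {
        "amount": "./Amount",
        "currency": "./Amount/@currency",
        "book_date": "./BookingDate",
        "value_date": "./ValueDate",
        "description": "./Description",
        "counterparty_account": "./Counterparty/Account",
    },
    "debit_credit_indicator": {"debit": "DBIT", "credit": "CRDT"},
}
```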
Based on the configuration of the parser, post-process the input stream to correct the sign of the `amount` value.
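A minimal sketch of that post-processing step, assuming the raw lines carry a separate debit/credit column (here called `dc_indicator`, an invented name) instead of signed amounts:

```python
# Flip the sign of debit lines when a debit/credit indicator is configured.
def correct_signs(lines, debit_credit_indicator):
    if not debit_credit_indicator:
        return lines                     # amounts already carry the correct sign
    debit_marker = debit_credit_indicator["debit"]
    for line in lines:
        if line.pop("dc_indicator", None) == debit_marker:
            line["amount"] = -abs(line["amount"])
        else:
            line["amount"] = abs(line["amount"])
    return lines
```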
Today, the procedure works as follows:
- Estimate which transactions of the same `transdate` on the payments account can be grouped into "estimated payments", split by type of transaction (`ar`, `ap`, `gl`), `memo` description and payment batch ("voucher"):
  - group transaction lines with the same `source` field value (in `acc_trans`), or
  - lacking a source, group lines with the same `entity_credit_account` (in case of `ar` or `ap` records) or `reference` (in case of a `gl` record)
- Add the estimated payments to the reconciliation set as "transactions to-be-reconciled"
- One-by-one, add the lines of the bank statement to the reconciliation data set (the output of [1]):
  - If the input source number (`scn`) is a number only, prefix it with the check-prefix
  - If the (prefixed) source number is provided:
    - Find the number of to-be-reconciled transactions for which the `transdate` and `source` match the 'posting date' and (prefixed) 'scn' from the bank account [original-set]
    - If the count equals zero (0), create an additional 'to be reconciled' record
    - If the count equals one (1), update the 'to be reconciled' record, marking it 'cleared'
    - If the count is higher, do these additional steps:
      - Find the number of to-be-reconciled transactions for which the `transdate`, `source` and `amount` match the values from the bank account
      - If the count equals zero (0), find the first match of the original-set (ordered by amount) and mark the identified record as 'cleared'
      - If the count equals one (1), mark the identified record as 'cleared'
      - If the count is higher, select the first of the records found and mark that one as 'cleared'
  - If the (prefixed) source isn't provided:
    - Find the number of to-be-reconciled transactions for which `transdate` and `amount` match the bank-provided data and the `source` doesn't start with the check-prefix
    - If the count equals zero (0), create an additional 'to be reconciled' record
    - If the count equals one (1), update that record, marking it 'cleared'
    - If the count is higher, select the first matching record and mark that one as 'cleared'
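The following sketch restates the procedure above in Python pseudocode. It is illustration only: the record structure, the check-prefix handling and the helper names are assumptions, not existing LedgerSMB code.

```python
def normalize_scn(scn, check_prefix):
    """Prefix a purely numeric source number with the check-prefix."""
    if scn and scn.isdigit():
        return check_prefix + scn
    return scn

def match_statement_line(line, to_be_reconciled, check_prefix):
    """Match one parsed bank statement line against the to-be-reconciled records.

    `line` is one dict produced by the parser; `to_be_reconciled` is a list of
    dicts with 'transdate', 'source', 'amount' and 'cleared' keys (assumed names).
    """
    scn = normalize_scn(line.get("scn"), check_prefix)

    if scn:
        # Match on posting date and (prefixed) source number [original-set].
        candidates = [r for r in to_be_reconciled
                      if r["transdate"] == line["book_date"] and r["source"] == scn]
        if not candidates:
            # No match: create an additional 'to be reconciled' record from the bank data.
            to_be_reconciled.append({"transdate": line["book_date"], "source": scn,
                                     "amount": line["amount"], "cleared": False})
        elif len(candidates) == 1:
            candidates[0]["cleared"] = True
        else:
            # Narrow down by amount; fall back to the first of the original-set
            # (ordered by amount) when no exact amount match exists.
            exact = [r for r in candidates if r["amount"] == line["amount"]]
            target = exact[0] if exact else sorted(candidates, key=lambda r: r["amount"])[0]
            target["cleared"] = True
    else:
        # No source number: match on date and amount, excluding check payments.
        candidates = [r for r in to_be_reconciled
                      if r["transdate"] == line["book_date"]
                      and r["amount"] == line["amount"]
                      and not str(r["source"] or "").startswith(check_prefix)]
        if not candidates:
            to_be_reconciled.append({"transdate": line["book_date"], "source": None,
                                     "amount": line["amount"], "cleared": False})
        else:
            # One match: clear it; multiple matches: clear the first one found.
            candidates[0]["cleared"] = True
```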
TODO
A pre-condition for this procedure should be that any transactions which have been reversed, or which are themselves reversals, are excluded from the matching of payments/bank account transactions against the bank data.
Another pre-condition is that we decide how payments are to be corrected, and how withheld bank charges should be incorporated as "additional postings" added to payments.
Options:
- Configurable rule-engine applying matching criteria in order of relevance
- Pre-coded matching with Bayesian statistics, where the algorithm learns which equalities/similarities are more important than others (i.e. which ones get overridden more often than others)
TODO
The process of matching the remaining transactions with open items (with the goal of creating payments) is one that likely requires explicit matching rules. Other options for mapping inputs to available items could include Bayesian statistics (as used by e.g. SpamAssassin).
For now, we consider these two options:
- Explicitly configured, rule-based matching
- Bayesian matching, derived from user classification
This kind of matching would require extensive configuration, linking a wide range of possible inputs from the bank statement to a set of open items (for which values such as PO #, order # and invoice # differ for each one).
To support this kind of functionality, some means of defining rules likely needs to be developed, together with an engine able to execute those rules for each combination of input line, rule and open item.
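As a very rough sketch of what such rule definitions and a matching engine could look like; the rule format, field names and regular expressions below are all invented for illustration:

```python
import re

# Hypothetical rules: each one extracts a token (e.g. an invoice number) from
# the statement description and compares it with a field of the open items.
rules = [
    {"pattern": r"invoice\s*#?\s*(\d{4}-\d{4})", "open_item_field": "invnumber"},
    {"pattern": r"PO\s*#?\s*(\d+)",              "open_item_field": "ponumber"},
]

def match_by_rules(statement_line, open_items, rules):
    """Return the open items matched by any rule for one statement line."""
    matches = []
    for rule in rules:
        m = re.search(rule["pattern"], statement_line["description"], re.I)
        if not m:
            continue
        token = m.group(1)
        matches.extend(item for item in open_items
                       if item.get(rule["open_item_field"]) == token)
    return matches
```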
While this mechanism is assumed to need far less explicit configuration, research is required from the LedgerSMB development team regarding tokenisation of bank statement inputs and the association of statistics with different (classes of) open items. [Note that these classes are hard to define during development; they're likely dependent on the use-case.]
While there's little configuration to be created explicitly, there's a lot of statistical data to be stored, to be used on the next iteration of reconciliation.
NOTE[1]: What to do when there's a benefit to sharing the statistics across companies?
NOTE[2]: A design is needed for storing and updating the statistical data.
NOTE[3]: Statistical research is required to determine on what level statistics need to be stored (do we need to store stats per counterparty? If not, what other level of granularity makes sense? ECA?)
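The sketch below illustrates, under the assumption of simple whitespace tokenisation and per-class token counts, the kind of statistical data that would need to be stored and updated between reconciliation runs. It deliberately leaves open the granularity questions raised in the notes above:

```python
from collections import defaultdict

# token -> class -> count; the "class" could be a counterparty, an ECA, or some
# other level of granularity still to be decided.
token_stats = defaultdict(lambda: defaultdict(int))

def tokenize(description):
    return [t.lower() for t in description.split() if len(t) > 2]

def learn(description, chosen_class):
    """Update the statistics after the operator confirms a match."""
    for token in tokenize(description):
        token_stats[token][chosen_class] += 1

def score(description, candidate_class):
    """Naive score: how strongly the tokens of a description point to a class."""
    return sum(token_stats[token][candidate_class]
               for token in tokenize(description))
```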
The process takes as its input one line of the payment input file and looks up from the database all open items before the value/book date of the input line.
The process produces one or more eligible open items for which the input could be a payment. The output will be handled by an operator; in case of multiple eligible items, the operator will select an appropriate one -- in case of a single one, the match will need to be reviewed.
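A minimal sketch of that lookup, assuming open items carry a `transdate` and an `amount_due` field (invented names) and that dates are ISO-formatted strings:

```python
def eligible_open_items(statement_line, open_items):
    """Open items posted on or before the value/book date of one statement line."""
    cutoff = statement_line.get("value_date") or statement_line["book_date"]
    candidates = [item for item in open_items if item["transdate"] <= cutoff]
    # Prefer exact amount matches when they exist; otherwise hand the full
    # candidate list to the operator for selection and review.
    exact = [item for item in candidates
             if item["amount_due"] == abs(statement_line["amount"])]
    return exact or candidates
```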