- Remove rules deprecations introduced in `v4.0.0`
- Change rule language of `selective_extractor`, `pseudonymizer`, `pre_detector` to support multiple outputs
- Add `string_splitter` processor to split strings of variable length into lists
- Add `ip_informer` processor to enrich events with ip information
- Allow running the `Pipeline` in python without input/output connectors
- Add `auto_rule_corpus_tester` to test a whole rule corpus against defined expected outputs.
- Add shorthand for converting datatypes to `dissector` dissect pattern language
- Add support for multiple output connectors
- Bump `attrs` to `>=22.2.0` and delete redundant `min_len_validator`
- Specify the metric labels for connectors (add name, type and direction as labels)
- Rename metric names to clarify their meanings (`logprep_pipeline_number_of_warnings` to `logprep_pipeline_sum_of_processor_warnings` and `logprep_pipeline_number_of_errors` to `logprep_pipeline_sum_of_processor_errors`)
- Fix a bug that broke templating of config and rule files with environment variables when one or more variables were not set in the environment
- Fix a bug in `opensearch_output` and `elasticsearch_output` where authentication issues were not handled
- Fix metric `logprep_pipeline_number_of_processed_events` to actually count the processed events per pipeline
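
The environment-variable templating fix noted above can be illustrated with a minimal sketch. It is not Logprep's implementation: the `${VAR}` placeholder syntax, the helper name `render_with_env` and the variable names are assumptions made for illustration. The idea shown is that unset variables leave their placeholder untouched instead of breaking the rendering of the whole file.

```python
import os
from string import Template


def render_with_env(text, env=None):
    """Substitute $VAR / ${VAR} placeholders, leaving unset ones untouched.

    Hypothetical helper, not part of Logprep.
    """
    env = os.environ if env is None else env
    # safe_substitute does not raise for missing names; it keeps the
    # placeholder text, so a partially configured environment does not
    # break rendering of the remaining variables.
    return Template(text).safe_substitute(env)


# Hypothetical config snippet with one set and one unset variable.
config_text = "hosts: ${OUTPUT_HOST}\nindex: ${OUTPUT_INDEX}\n"
rendered = render_with_env(config_text, {"OUTPUT_HOST": "localhost:9200"})
print(rendered)
```

Passing an explicit mapping instead of `os.environ` keeps the sketch deterministic; calling `render_with_env(config_text)` would read the process environment instead.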
- Reimplement the `selective_extractor`
- Drop support for Python `3.6`, `3.7`, `3.8`
- Change default prefix behavior on appending to strings of `dissector`
- Add an `http input connector` that spawns a uvicorn server which parses request content into events.
- Add a `file input connector` that reads generic logfiles.
- Provide the possibility to consume lists, rules and configuration from files and http endpoints
- Add `requester` processor that enriches by making http requests with field values
- Add `calculator` processor to calculate with or without field values
- Make output subfields of the `geoip_enricher` configurable by introducing the rule config `customize_target_subfields`
- Add a `timestamp_differ` processor that can parse two timestamps and calculate their respective time delta.
- Add `config_refresh_interval` configuration option to refresh the configuration on a given timedelta
- Add option to `dissector` to use a prefix pattern in dissect language for appending to strings and add the default behavior to append to strings without any prefixed separator
- Add support for python `3.10` and `3.11`
- Add option to submit a template with `list_search_base_path` config parameter in `list_comparison` processor
- Add functionality to `geoip_enricher` to download the geoip-database
- Add ability to use environment variables in rules and config
- Add list access including slicing to dotted field notation for getting values
- Add processor boilerplate generator to help adding new processors
- Fix count of `number_of_processed_events` metric in `input` connector. Will now only count actual events.
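
The list access with slicing in dotted field notation mentioned above can be sketched as follows. This is an illustrative stand-in, not Logprep's helper: the function name `get_dotted_value` and the exact segment syntax (`a.b.1` for an index, `a.b.1:3` for a slice) are assumptions.

```python
from typing import Any


def get_dotted_value(event: dict, dotted_field: str) -> Any:
    """Resolve a dotted path; a segment applied to a list may be an index or a slice.

    Hypothetical helper; returns None when the path cannot be resolved.
    """
    current: Any = event
    for segment in dotted_field.split("."):
        if isinstance(current, dict):
            if segment not in current:
                return None
            current = current[segment]
        elif isinstance(current, list):
            try:
                if ":" in segment:
                    # "1:3" -> slice(1, 3); an empty part means open-ended
                    parts = [int(p) if p else None for p in segment.split(":")]
                    current = current[slice(*parts)]
                else:
                    current = current[int(segment)]
            except (ValueError, IndexError, TypeError):
                return None
        else:
            # Cannot descend into a scalar value
            return None
    return current


event = {"a": {"b": [10, 20, 30, 40]}}
print(get_dotted_value(event, "a.b.1"))
print(get_dotted_value(event, "a.b.1:3"))
```

Returning `None` for unresolvable paths is a design choice of this sketch; a real implementation might raise or return a sentinel to distinguish a missing field from a stored `None`.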
- Split the general `connector` config into `input` and `output` to compose connector config independently
- Removal of Deprecated Feature: HMAC-Options in the connector consumer options have to be under the subkey `preprocessing` of the `input` processor
- Removal of Deprecated Feature: `delete` processor was renamed to `deleter`
- Rename `writing_output` connector to `jsonl_output`
- Add an opensearch output connector that can be used to write directly into opensearch.
- Add an elasticsearch output connector that can be used to write directly into elasticsearch.
- Split connector config into separate config keys `input` and `output`
- Add preprocessing capabilities to all input connectors
- Add preprocessor for log_arrival_time
- Add preprocessor for log_arrival_timedelta
- Add metrics to connectors
- Add `concatenator` processor that can combine multiple source fields
- Add `dissector` processor that tokenizes messages into new or existing fields
- Add `key_checker` processor that checks if all dotted fields from a list are present in the event
- Add `field_manager` processor that copies or moves fields and merges lists
- Add ability to delete source fields to `concatenator`, `datetime_extractor`, `dissector`, `domain_label_extractor`, `domain_resolver`, `geoip_enricher` and `list_comparison`
- Add ability to overwrite target field to `datetime_extractor`, `domain_label_extractor`, `domain_resolver`, `geoip_enricher` and `list_comparison`
- Validate connector config on class level via attrs classes
- Implement a common interface to all connectors
- Refactor connector code
- Revise the documentation
- Add `sphinxcontrib.datatemplates` and `testcase-renderer` to docs
- Reimplement `get_dotted_field_value` helper method which should lead to increased performance
- Reimplement `dropper` processor code to improve performance
- `datetime_extractor.datetime_field` is deprecated. Use `datetime_extractor.source_fields` as list instead.
- `datetime_extractor.destination_field` is deprecated. Use `datetime_extractor.target_field` instead.
- `delete` is deprecated. Use `deleter.delete` instead.
- `domain_label_extractor.target_field` is deprecated. Use `domain_label_extractor.source_fields` as list instead.
- `domain_label_extractor.output_field` is deprecated. Use `domain_label_extractor.target_field` instead.
- `domain_resolver.source_url_or_domain` is deprecated. Use `domain_resolver.source_fields` as list instead.
- `domain_resolver.output_field` is deprecated. Use `domain_resolver.target_field` instead.
- `drop` is deprecated. Use `dropper.drop` instead.
- `drop_full` is deprecated. Use `dropper.drop_full` instead.
- `geoip_enricher.source_ip` is deprecated. Use `geoip_enricher.source_fields` as list instead.
- `geoip_enricher.output_field` is deprecated. Use `geoip_enricher.target_field` instead.
- `label` is deprecated. Use `labeler.label` instead.
- `list_comparison.check_field` is deprecated. Use `list_comparison.source_fields` as list instead.
- `list_comparison.output_field` is deprecated. Use `list_comparison.target_field` instead.
- `pseudonymize` is deprecated. Use `pseudonymizer.pseudonyms` instead.
- `url_fields` is deprecated. Use `pseudonymizer.url_fields` instead.
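
The renames in the deprecation list above can be applied mechanically. The following is a rough sketch, assuming a rule is already loaded as a plain Python `dict` with processor sections as nested dicts; `migrate_rule` and the three lookup tables are hypothetical names and not part of Logprep. The key mappings themselves are taken from the list.

```python
from copy import deepcopy

# Deprecated top-level rule keys and the (processor, key) they moved to.
TOP_LEVEL_MOVES = {
    "delete": ("deleter", "delete"),
    "drop": ("dropper", "drop"),
    "drop_full": ("dropper", "drop_full"),
    "label": ("labeler", "label"),
    "pseudonymize": ("pseudonymizer", "pseudonyms"),
    "url_fields": ("pseudonymizer", "url_fields"),
}

# Deprecated single-value source keys; each becomes a one-element source_fields list.
SOURCE_RENAMES = {
    "datetime_extractor": "datetime_field",
    "domain_label_extractor": "target_field",
    "domain_resolver": "source_url_or_domain",
    "geoip_enricher": "source_ip",
    "list_comparison": "check_field",
}

# Deprecated output keys; each becomes target_field.
TARGET_RENAMES = {
    "datetime_extractor": "destination_field",
    "domain_label_extractor": "output_field",
    "domain_resolver": "output_field",
    "geoip_enricher": "output_field",
    "list_comparison": "output_field",
}


def migrate_rule(rule: dict) -> dict:
    """Return a copy of the rule with deprecated keys renamed (hypothetical helper)."""
    migrated = deepcopy(rule)
    for old_key, (processor, new_key) in TOP_LEVEL_MOVES.items():
        if old_key in migrated:
            migrated.setdefault(processor, {})[new_key] = migrated.pop(old_key)
    for processor, section in migrated.items():
        if not isinstance(section, dict):
            continue
        old_source = SOURCE_RENAMES.get(processor)
        # Rename the source first: domain_label_extractor reuses the name
        # target_field, which meant "source" before and "target" afterwards.
        # A section that already has source_fields is treated as migrated.
        if old_source in section and "source_fields" not in section:
            section["source_fields"] = [section.pop(old_source)]
        old_target = TARGET_RENAMES.get(processor)
        if old_target in section:
            section["target_field"] = section.pop(old_target)
    return migrated


old_rule = {
    "filter": "url",
    "domain_label_extractor": {"target_field": "url", "output_field": "url_parts"},
    "drop": ["temporary"],
}
new_rule = migrate_rule(old_rule)
print(new_rule)
```

The order of the two renames matters for `domain_label_extractor`, because `target_field` changes meaning; the `source_fields` check is only a heuristic for recognising sections that were already migrated.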
- Fix resetting of some metrics, e.g. `number_of_matches`.
- Normalizer can now write grok failure fields to an event when no grok pattern matches and if `failure_target_field` is specified in the configuration
- Fix config validation of the preprocessor `version_info_target_field`.
- Add feature to automatically add version information to all events, configured via the `connector > consumer > preprocessing` configuration
- Expose logprep and config version in metric targets
- Dry-Run now accepts a single json without brackets for input type `json`
- Move the config hmac options to the new subkey `preprocessing`, maintain backward compatibility, but mark old version as deprecated.
- Make the generic adder write the SQL table to a file and load it from there instead of loading it from the database for every process of the multiprocessing pipeline. Furthermore, only connect to the SQL database when checking whether the database table has changed and the file is stale. This reduces the number of SQL connections. Before, there was permanently one active connection per multiprocessing pipeline; now there is only one active connection per Logprep instance when accessing the database.
- Fix SelectiveExtractor output. The internal extracted list wasn't cleared between each event, leading to duplication in the output of the processor. Now the events are cleared such that only the result of the current event is returned.
- Add metric for mean processing time per event for the full pipeline, in addition to per processor
- Fix performance of the metrics tracking. Due to a store metrics statement at the wrong position the logprep performance was dramatically decreased when tracking metrics was activated.
- Fix Auto Rule Tester which tried to access processor stats that do not exist anymore.
- Add ability to add fields from SQL database via GenericAdder
- Prometheus Exporter now also exports processor specific metrics
- Add `--version` cli argument to print the current logprep version, as well as the configuration version if found
- Automatically release logprep on pypi
- Configure abstract dependencies for pypi releases
- Refactor domain resolver
- Refactor `processor_stats` to `metrics`. Metrics are now collected in separate dataclasses
- Fix processor initialization in auto rule tester
- Fix generation of RST-Docs
- Metrics refactoring:
  - The json output format of the previously known status_logger has changed
  - The configuration key word is now `metrics` instead of `status_logger`
  - The configuration for the time measurement is now part of the metrics configuration
  - The metrics tracking still includes values about how many warnings and errors happened, but not of what type. For that the regular logprep logging should be consulted.
- Clear matching rules before processing in clusterer
- Add missing sphinxcontrib-mermaid in tox.ini
- Add generic processor interface `logprep.abc.processor.Processor`
- Add `delete` processor to be used with rules.
- Delete `donothing` processor
- Add `attrs` based `Config` classes for each processor
- Add validation of processor config in config class
- Make all processors use python `__slots__`
- Add `ProcessorRegistry` to register all processors
- Remove plugins feature
- Add `ProcessorConfiguration` as an adapter to create configuration for processors
- Remove all specific processor factories in favor of `logprep.processor.processor_factory.ProcessorFactory`
- Rewrite `ProcessorFactory`
- Automate processor configuration documentation
- Generalize config parameter for using tld lists to `tld_lists` for `domain_resolver`, `domain_label_extractor`, `pseudonymizer`
- Refactor `domain_resolver` to make code cleaner and increase test coverage
- Remove `ujson` dependency because of a CVE