Improwing speed and reduce code size when fast_float is using as internal parser code. #307

IRainman · 2025-03-05T20:24:25Z

CONTRIBUTORS File

Change: Added "Elle Solomina" to the list of contributors.
Reason: Acknowledging contributions made by a new collaborator.
README.md

Changes:
Added documentation for using the fast_float library with non-standard options.
Introduced examples illustrating the use of FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN macro to optimize code for parsing only positive numbers in specific scenarios.
Reason: To provide detailed instructions and examples for using custom parsing options and macros for performance optimization.
benchmarks/benchmark.cpp

Changes:
Added support for the macro FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN and associated conditional code to handle scenarios where only positive numbers are parsed without infinity (inf) or NaN.
Simplified and generalized parsing functions using templates for better type reuse (findmax_fastfloat).
Streamlined the handling of UTF-16 data with emplace_back.
Improved clarity and performance by replacing some variables with constants (const).
Code cleanup and optimization (e.g., reduced unnecessary includes, minor code refactoring).
Reason: Improved performance, code structure, and made the benchmarks more configurable and adaptable to specialized cases.
benchmarks/event_counter.h

Changes:
Replaced std::vector with std::array for storing event counters to improve performance and memory efficiency.
Introduced an enum event_counter_types for better readability when accessing counter types.
General refactoring for improved maintainability and robustness.
Reason: Optimizations for memory and computational efficiency while improving readability.
benchmarks/linux-perf-events.h

Changes:
Replaced std::vector with std::array for storing configuration values.
Refactored method signatures for better clarity and robustness.
Reason: Enhanced performance and reduced overhead by replacing dynamic containers with fixed-size arrays.
include/fast_float/ascii_number.h

Changes:
Introduced conditional compilation for the byteswap function based on platform support.
Minor improvements to loop iteration variables for consistency and efficiency.
Reason: Improved compatibility and performance.

include/fast_float/bigint.h

Changes:
    Introduced type aliases (e.g., bigint_bits_t, am_digits) for better type clarity.
    Replaced uint16_t with new type aliases for readability.
    Refactored container class stackvec to use modern coding practices (e.g., noexcept).
Reason: Improved type safety, readability, and maintainability.

include/fast_float/constexpr_feature_detect.h

Changes:
    Added macros for detecting support for features like byteswap and [[assume]].
Reason: Improved feature detection for cross-platform compatibility.

include/fast_float/decimal_to_binary.h

Changes:
    Modified function templates to use more specific type aliases for better readability and safety.
Reason: Improved maintainability and reduced potential for errors.

include/fast_float/digit_comparison.h

Changes:
    Replaced raw types (e.g., int32_t) with specific type aliases (e.g., am_pow_t) for better clarity.
    Minor refactoring to improve consistency and readability.
Reason: Enhanced code clarity and maintainability.

include/fast_float/float_common.h

Changes:
    Introduced new type aliases for mantissa and power-related types (e.g., am_mant_t, am_pow_t).
    Updated chars_format enum to include additional options and improve compatibility with macros.
Reason: Improved type safety and added flexibility for parsing options.

include/fast_float/parse_number.h

Changes:
    Added conditional compilation to exclude certain checks when FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN is defined.
    Refactored functions to use new type aliases and improved readability.
Reason: Simplified parsing logic for specialized use cases and improved code maintainability.

include/fast_float/fast_float.h

Changes:
    Updated function signatures to include const qualifiers for parameters.
Reason: Improved code clarity and ensured immutability where applicable.

tests/

In process...

Additional Improvements

General:
Added or updated inline comments to improve code readability.
Improved consistency in coding style (e.g., use of const and noexcept where applicable).
Enhanced clarity of error messages and assertions in test cases.
Reason: Improved overall quality, readability, and maintainability of the codebase.

Currently, it's a draft.

…_SIGN and FASTFLOAT_DISALLOW_NAN. This both allow to significantly reduce code size and speedup your code when you use fast_float as a part of your parser.

lemire · 2025-03-05T20:28:56Z

@IRainman

This both allow to significantly reduce code size and speedup your code when you use fast_float as a part of your parser.

I am skeptical. Your claims are not documented or quantified.

IRainman · 2025-03-05T20:52:15Z

@IRainman

This both allow to significantly reduce code size and speedup your code when you use fast_float as a part of your parser.

I am skeptical. Your claims are not documented or quantified.

OK, I'll add some additional tests, related to parsing only positive numbers and positive infinite. It's really a common case in any mathematical parser. Maybe I should take a more aggressive approach and disable infinity too. In theory, this defines should not be needed with constexpr if...
Actually, one big question: do you plan to allow the compile time config options? Because current realisation parse options in fully software mode and this is bad.

dalle · 2025-03-05T21:09:11Z

I recently changed some compile time config options into runtime config options. I believe that both worlds would be good. I would prefer if the compile time options to not be any ifdef-macros, but controlled by using template arguments.

lemire · 2025-03-05T22:41:23Z

In this instance, the argument here is to reduce compiled size and improve runtime performance we avoid checking negatives ('-') and for 'nan'/'inf'. In my view, it is an extraordinary claim that checking for the negative ('-') and for 'nan'/'inf' makes a large difference.

It is possibly true, but I'd like to see the numbers. Please note that we have do have benchmarks included in the library.

Because current realisation parse options in fully software mode and this is bad.

How did you benchmark to arrive at the conclusion that it is bad?

I recently changed some compile time config options into runtime config options.

I would not assume that it makes a difference to the performance unless we have a hard data. Some parameters can be passed as template arguments, certainly...

But we need to take into account that parsing a single number takes hundreds of instructions. Predictable branches are not a big concern.

IRainman · 2025-03-06T12:53:12Z

I recently changed some compile time config options into runtime config options. I believe that both worlds would be good. I would prefer if the compile time options to not be any ifdef-macros, but controlled by using template arguments.

Very well! I'm currently reworking my patch to use templates argument and want to add more constexpr / consteval to the code.

IRainman · 2025-03-06T13:07:54Z

In this instance, the argument here is to reduce compiled size and improve runtime performance we avoid checking negatives ('-') and for 'nan'/'inf'. In my view, it is an extraordinary claim that checking for the negative ('-') and for 'nan'/'inf' makes a large difference.

It is possibly true, but I'd like to see the numbers. Please note that we have do have benchmarks included in the library.

I'm measuring performance on the mathematical parser, that already processes minus signs, and I yesterday rewrote it to also process inf / infinity for mathematical questions. In mathematical parser nan isn't possible at all because if it's nan the parser stops processing equations immediately.

Because current realisation parse options in fully software mode and this is bad.

How did you benchmark to arrive at the conclusion that it is bad?

Options it's not fully processed by the template and generate many additional lines of code. Also complete removing the minus sign for mantissa and inf/nan from fast_float significantly reduces generated assembler.

To check the performance of such a case, tests should contain only positive numbers in general notation.

dalle · 2025-03-06T13:22:58Z

I recently changed some compile time config options into runtime config options. I believe that both worlds would be good. I would prefer if the compile time options to not be any ifdef-macros, but controlled by using template arguments.

Very well! I'm currently reworking my patch to use templates argument and want to add more constexpr / consteval to the code.

Please don't do this until we have the performance comparisons on the gains.

…teval, also significantly reduce register pressure by reducing copy of constant data.

IRainman · 2025-03-06T15:39:08Z

I recently changed some compile time config options into runtime config options. I believe that both worlds would be good. I would prefer if the compile time options to not be any ifdef-macros, but controlled by using template arguments.

Very well! I'm currently reworking my patch to use templates argument and want to add more constexpr / consteval to the code.

Please don't do this until we have the performance comparisons on the gains.

Of course, I just cleaned my code of defines and replaced them with an option. Unfortunately, code isn't fully clean-up at compile time as a preprocessing macro.

IRainman · 2025-03-06T15:59:37Z

P. S. my current version of the test code is:
[[assume(_view._Unchecked_end() - _view._Unchecked_begin() >= 1)]]; double val; constexpr auto options{fast_float::chars_format::general | fast_float::chars_format::no_infnan | fast_float::chars_format::disallow_leading_sign}; const auto res = fast_float::from_chars(_view._Unchecked_begin(), _view._Unchecked_end(), val, options); [[assume(res.ptr - _view._Unchecked_begin() >= 1)]]; const size_t n = res.ptr - _view._Unchecked_begin();

… faster and more compact code parsing numbers with input support only positive C/C++ style numbers without nan or inf. That case is very useful in mathematical applications, game development, CSS parsing, embedded code, etc... Additional improve in constant initialization.

…k fine.

include/fast_float/ascii_number.h

…R_WO_INF_NAN is enabled we assume that we are in parser code with external loop that checks bounds. Function cpp20_and_in_constexpr() now is really compile time evaluated. TODO fix warnings.

include/fast_float/float_common.h

…nd FASTFLOAT_SKIP_WHITE_SPACE, please use options. Compilation fix when FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN isn't defined.

Added examples of usage FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN macros and documented allow_leading_plus and skip_white_space options

dalle · 2025-03-09T00:01:10Z

I'm sorry to interrupt you, but I think you're going the wrong way here and focusing on the wrong stuff. I don't want to seem pompous or ignorant, and I don't want to offend you, I just don't want you to (possibly) waste more time on small changes.

I will say it clearly: Without proof of the gains there is very low chances that PR will be merged.

So please focus on the tests, make the performance/benchmarks to compile and run. Then you should run the performance/benchmarks workflows. If there are quantified improvements then the likelihood of merging this PR will increase substantially.

…n P1774

lemire · 2025-03-09T00:41:55Z

I agree with @dalle.

It is fine to fork the library for your own purposes… it is open source. You can adapt it to your own needs. Please do so.

But we won’t take this PR unless we are convinced of its benefits. And you have not provided the evidence.

We are concerned about maintaining the code, keeping it correct and well tested.

Changes must be motivated.

…POSITIVE_C_NUMBER_WO_INF_NAN. Now benchmark only measure parameters for fast_float::from_chars and nothing else. Copy-past fix.

IRainman · 2025-03-09T17:03:51Z

I added an improvement test with parser emulation.

P. S. In real code of mathematical processing engine, my patches also give significant improvements (this is from MSVC, not tested on the other compilers):

std::from_chars

Tests: exactly: 115, almost: 9,
failed: 4,
time is: 32064ms.

fast_float::from_chars

Tests:
time is: 26684ms.

( commits from this PR )

Tests (100% of CPU cores is used by BOINC):
time is: 43135ms.
Tests (50% of CPU cores is used by BOINC):
time is: 28898ms.
Tests (BOINC is stopped):
time is: 25459ms.

When switch FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN
is applied:

Tests: exactly: 115, almost: 9,
failed: 4,
time is: 25947ms.

( commits from this PR )

Tests (100% of CPU cores is used by BOINC):
time is: 39588ms.
Tests (50% of CPU cores is used by BOINC):
time is: 27301ms.
Tests (BOINC is stopped):
time is: 24597ms.

…ssible for the counter. Because the library only support 32 and 64 bit platform we only need 32 bit as a small counter.

include/fast_float/ascii_number.h

…FLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN.

exponent value is always less than in16_t. original main: Tests: time is: 44278ms. size of my tests 389.0k size of my program 164.0k my main: Tests: time is: 42015ms. size of my tests 389.0k size of my program 164.0k my main with FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN Tests: time is: 41282ms. size of my tests 386.5k size of my program 161.5k After this I'll try it on my partner Linux machine with the original tests and compare much better.

* fix for some types errors. * fix small amount of not optimized code. * add more comments to the code. * unified of function binary_format::max_mantissa_fast_path() because it's do the same.

IRainman · 2025-04-12T14:28:15Z

@lemire @dalle , can I compile all CI tests locally on the windows?

* Added a config macro FASTFLOAT_DISALLOW_ANY_LEADING_SYMBOLS_INCLUDE…

b0b1954

…_SIGN and FASTFLOAT_DISALLOW_NAN. This both allow to significantly reduce code size and speedup your code when you use fast_float as a part of your parser.

IRainman added 2 commits March 6, 2025 18:29

* Added an option disallow_leading_sign and stronger constexpr / cons…

63f6abe

…teval, also significantly reduce register pressure by reducing copy of constant data.

Merge branch 'main' of https://github.com/IRainman/fast_float

e35e7c7

IRainman added 4 commits March 6, 2025 19:43

more const

2879564

fix for 32 bit build. completely done. All other parser mode also wor…

7a38e1b

…k fine.

# cleanup.

bc3f331

dalle reviewed Mar 6, 2025

View reviewed changes

include/fast_float/ascii_number.h Outdated Show resolved Hide resolved

IRainman added 2 commits March 7, 2025 14:51

Additional compile time cleanup. When FASTFLOAT_ONLY_POSITIVE_C_NUMBE…

aba93f3

…R_WO_INF_NAN is enabled we assume that we are in parser code with external loop that checks bounds. Function cpp20_and_in_constexpr() now is really compile time evaluated. TODO fix warnings.

fix type conversion warning.

388426e

dalle reviewed Mar 7, 2025

View reviewed changes

include/fast_float/float_common.h Outdated Show resolved Hide resolved

IRainman added 3 commits March 7, 2025 20:39

PVS-Studio founds some errors, I fixed it.

ae29a0d

Completely remove deprecated macroses FASTFLOAT_ALLOWS_LEADING_PLUS a…

f496321

…nd FASTFLOAT_SKIP_WHITE_SPACE, please use options. Compilation fix when FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN isn't defined.

Update README.md

ac22204

Added examples of usage FASTFLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN macros and documented allow_leading_plus and skip_white_space options

IRainman added 2 commits March 9, 2025 03:41

Added FASTFLOAT_ASSUME for support attribute [[assume]] is declared i…

c2daa8a

…n P1774

Merge branch 'main' of https://github.com/IRainman/fast_float

23a1367

IRainman added 2 commits March 9, 2025 04:07

Fix compilation of benchmarks

5097243

benchmark are updated, added AST parser emulation for FASTFLOAT_ONLY_…

a22dfc6

…POSITIVE_C_NUMBER_WO_INF_NAN. Now benchmark only measure parameters for fast_float::from_chars and nothing else. Copy-past fix.

IRainman added 5 commits April 8, 2025 04:06

style cleanup.

27f0265

style fix.

97bfec6

after all sized checks is done I return the minimum registers size po…

b8f7771

…ssible for the counter. Because the library only support 32 and 64 bit platform we only need 32 bit as a small counter.

after all sized checks is done I return the minimum registers size po…

f1b7f49

…ssible for the counter. Because the library only support 32 and 64 bit platform we only need 32 bit as a small counter.

Merge branch 'main' of https://github.com/IRainman/fast_float

a591ca2

IRainman commented Apr 9, 2025

View reviewed changes

include/fast_float/ascii_number.h Show resolved Hide resolved

IRainman added 22 commits April 9, 2025 15:22

reduce size of from_chars_result_t to 4 bytes. Cleanup for usage FAST…

7c96e3a

…FLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN.

reduce size of from_chars_result_t to 4 bytes. Cleanup for usage FAST…

a081ebe

…FLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN.

Merge branch 'main' of https://github.com/IRainman/fast_float

8d4ca69

reduce size of from_chars_result_t to 4 bytes. Cleanup for usage FAST…

d32ae04

…FLOAT_ONLY_POSITIVE_C_NUMBER_WO_INF_NAN.

template interface cleanup for min_safe_u64 and max_digits_u64.

c762936

cleanup code generation for parse_mantissa.

bbf4193

trying to fix tests.

2da25b5

trying to fix tests.

6cacae0

fix warnings.

68fe735

fix warnings.

f8625b6

Merge branch 'main' of https://github.com/IRainman/fast_float

a4d1174

additional improve for debug runtime.

e71bfff

benchmark cleanup.

88b3887

* fix errors in the parse_number_string.

563648f

# format

0daee75

try reordering again.

1aed8ee

try reordering again.

f3db77a

* added chars_format_t for performance reason.

b0bae17

try additional part...

69fbbff

* carefully work with types in the library.

ba1344c

* fix for some types errors. * fix small amount of not optimized code. * add more comments to the code. * unified of function binary_format::max_mantissa_fast_path() because it's do the same.

# format.

0a18d6b

* additional types cleanup for speedup and reduce cache pressure.

17ffdff

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improwing speed and reduce code size when fast_float is using as internal parser code. #307

Improwing speed and reduce code size when fast_float is using as internal parser code. #307

IRainman commented Mar 5, 2025 •

edited

Loading

lemire commented Mar 5, 2025

IRainman commented Mar 5, 2025 •

edited

Loading

dalle commented Mar 5, 2025

lemire commented Mar 5, 2025 •

edited

Loading

IRainman commented Mar 6, 2025

IRainman commented Mar 6, 2025

dalle commented Mar 6, 2025

IRainman commented Mar 6, 2025

IRainman commented Mar 6, 2025

dalle commented Mar 9, 2025

lemire commented Mar 9, 2025

IRainman commented Mar 9, 2025 •

edited

Loading

IRainman commented Apr 12, 2025

Improwing speed and reduce code size when fast_float is using as internal parser code. #307

Are you sure you want to change the base?

Improwing speed and reduce code size when fast_float is using as internal parser code. #307

Conversation

IRainman commented Mar 5, 2025 • edited Loading

lemire commented Mar 5, 2025

IRainman commented Mar 5, 2025 • edited Loading

dalle commented Mar 5, 2025

lemire commented Mar 5, 2025 • edited Loading

IRainman commented Mar 6, 2025

IRainman commented Mar 6, 2025

dalle commented Mar 6, 2025

IRainman commented Mar 6, 2025

IRainman commented Mar 6, 2025

dalle commented Mar 9, 2025

lemire commented Mar 9, 2025

IRainman commented Mar 9, 2025 • edited Loading

IRainman commented Apr 12, 2025

IRainman commented Mar 5, 2025 •

edited

Loading

IRainman commented Mar 5, 2025 •

edited

Loading

lemire commented Mar 5, 2025 •

edited

Loading

IRainman commented Mar 9, 2025 •

edited

Loading