README.md (+21 -9)
@@ -2,18 +2,32 @@
 ## Overview

-This Go program is designed to efficiently process a large dataset of temperature readings for different weather stations, as part of the [One Billion Row Challenge](https://github.com/gunnarmorling/1brc). The program reads a text file containing temperature measurements, calculates the minimum, mean, and maximum temperature for each station, and outputs the results to standard output (stdout). Additionally, it measures and displays the total processing time.
+This Go program is designed to efficiently process a large dataset of temperature readings for different weather stations, as part of the [One Billion Row Challenge](https://github.com/gunnarmorling/1brc). The program reads a text file containing temperature measurements, calculates the minimum, mean, and maximum temperature for each station, and outputs the results to stdout. Additionally, it measures and displays the total processing time.

-## Key Features
+## Key Features (v1.0.0)

-- **Concurrency:** Uses goroutines for parallel processing, enhancing performance on multi-core processors.
-- **Efficient File Reading:** Employs buffered reading for handling large files effectively.
+- **Concurrency:** Uses goroutines for parallel processing to enhance performance on multi-core processors.
+- **Efficient File Reading:** Employs buffered reading to handle the 12 GB dataset more effectively.
 - **Data Aggregation:** Calculates min, mean, and max temperatures for each station.
 - **Performance Measurement:** Reports the total time taken for processing.

+Processing Time: 9m21s. Tested with a Ryzen 7 5800X3D.
+
+## Recent Optimizations (v1.1.0)
+
+The program has undergone several optimizations to improve its processing time:
+
+- **Concurrency Model Improved:** Implemented a worker pool pattern for dynamic goroutine management and balanced workload distribution.
+- **Buffered Channels:** Increased channel buffer sizes to reduce blocking and increase throughput.
+- **Batch Processing:** Processes multiple lines of data in a single goroutine to reduce per-line overhead.
+- **I/O Enhancements:** Adjusted file reading to use larger chunks, reducing I/O bottlenecks.
+
+Processing Time: 6m53s. Tested with a Ryzen 7 5800X3D.
+
 ## Requirements

-- Go Binaries ofc
+- Go toolchain (1.21)
+- The dataset, generated and ready to use; see the [One Billion Row Challenge](https://github.com/gunnarmorling/1brc) repository for instructions.

 ## How to Run the Program

@@ -34,14 +48,12 @@ This Go program is designed to efficiently process a large dataset of temperatur