feat(docs): update docs (#567)
* update usage/

* accounts-db fixes

* finish update db docs

* fix geyser docs

* fix other docs

* fix geyser docs

* fix geyser docs

* more fixes

* fix main readme

* fix

* fix docs
0xNineteen authored Feb 14, 2025
1 parent 3f6ab00 commit 50a4f20
Showing 40 changed files with 1,102 additions and 3,882 deletions.
13 changes: 7 additions & 6 deletions docs/check.py
@@ -4,20 +4,19 @@
import generate as g

# checks if the docs folder is up to date with the source readme.md files
# NOTE: only supports either `python docs/check.py .` OR `python check.py ../`
if __name__ == "__main__":
    arg_parser = argparse.ArgumentParser()
    arg_parser.add_argument("src_dir")
    args = arg_parser.parse_args()

    exclude_dirs = [
-        args.src_dir + "docs", # dont search yourself
-        args.src_dir + "data", # this should only include data
+        os.path.join(args.src_dir, "docs"), # dont search yourself
+        os.path.join(args.src_dir, "data"), # this should only include data
    ]

-    code_path = os.path.join(args.src_dir, "docs/docusaurus/docs/code")
-    for name, src_path in g.get_markdown_files(args.src_dir, exclude_dirs):
-        docs_path = os.path.join(code_path, name + ".md")
+    doc_dir_path = os.path.join(args.src_dir, "docs/docusaurus/docs")
+    for src_path, docs_path in g.get_markdown_files(args.src_dir, exclude_dirs, doc_dir_path):
        # check to see if the files are the same !
        with open(src_path, "r") as src_f:
            with open(docs_path, "r") as docs_f:

@@ -38,3 +37,5 @@
                print("Docs:", docs_lines[i])
                break
        exit(1)
+
+    print("Docs folder is up to date!")
@@ -1,9 +1,9 @@
---
-sidebar_position: 4
-title: Join Us
+sidebar_position: 10
+title: Join The Team
---

If you are a talented engineer who thrives in a collaborative and fast-paced environment,
and you're excited about contributing to the advancement of Solana's ecosystem, we would love to hear from you.

See our current openings [here](https://jobs.ashbyhq.com/syndica).
381 changes: 157 additions & 224 deletions docs/docusaurus/docs/code/accountsdb.md

Large diffs are not rendered by default.

104 changes: 69 additions & 35 deletions docs/docusaurus/docs/code/geyser.md
@@ -5,80 +5,114 @@ it's only used to stream accounts from a snapshot while loading/validating a snapshot

The main code is located in `/src/geyser/`.

-`lib.zig` and contains a few key structs:
+`lib.zig` contains a few key structs:
- `GeyserWriter`: used to write new accounts
- `GeyserReader`: used to read new accounts

-both use linux pipes to stream data. this involves
+Linux pipes are used to stream data. This involves
opening a file-based pipe using the `mkfifo` syscall which is then
-written to like any other file. the key method used to setup
+written to like any other file. The key method used to setup
the pipes is `openPipe` in `src/geyser/core.zig`.

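To make the pattern concrete, here is a minimal sketch of what an `openPipe`-style helper does (illustrative only; the actual implementation lives in `src/geyser/core.zig`):

```zig
const std = @import("std");

// Minimal sketch of the mkfifo-then-open pattern (illustrative only;
// see `openPipe` in `src/geyser/core.zig` for the real implementation).
fn openPipeSketch(path: [*:0]const u8) !std.fs.File {
    // create the named pipe; if it already exists, mkfifo fails and we reuse it
    _ = std.c.mkfifo(path, 0o666);
    // once created, the pipe is opened and written to like any other file
    return std.fs.cwd().openFileZ(path, .{ .mode = .read_write });
}
```
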
-## cli commands
+## Usage

-while running, grafana stats will be available. the main binary code is in
-`src/geyser/main.zig`
-
-### benchmarking
-
-we also have benchmarking to measure the throughput of geyser. you can run it using
+The main binary code is in `src/geyser/main.zig` and can be used to
+read accounts during account validation using:

```bash
-zig build -Doptimize=ReleaseSafe
-./zig-out/bin/benchmark geyser
-```
-
-you can also benchmark an dummy reader
-
-```bash
-# in terminal 1 -- read the snapshot accounts to geyser
-./zig-out/bin/sig snapshot-validate -g data/genesis-files/testnet_genesis.bin --enable-geyser -a 250 -t 2
+# run the snapshot validator with geyser enabled
+./zig-out/bin/sig snapshot-validate --enable-geyser &

-# in terminal 2 -- benchmark how fast you can read
-./zig-out/bin/geyser benchmark
+# run the geyser reader which dumps the accounts to a csv file
+./zig-out/bin/geyser csv
```

-### dump a snapshot to csv
+Metrics are also available to understand how fast data is being written/read, which can
+be viewed from the grafana dashboard (see the `metrics` docs for more details).

-after downloading a snapshots, you can dump the accounts to a csv using the
-csv geyser command, for example:
+## Csv File Dumping
+
+After downloading a snapshot, you can dump the accounts to a csv using the
+`csv` geyser command:

```bash
# in terminal 1 -- read the snapshot accounts to geyser
-./zig-out/bin/sig snapshot-validate -g data/genesis-files/testnet_genesis.bin --enable-geyser -a 250 -t 2
+./zig-out/bin/sig snapshot-validate -n testnet --enable-geyser

# in terminal 2 -- dump accounts to a csv (validator/accounts.csv)
./zig-out/bin/geyser csv
# OR dump only specific account owners (ie, the drift program)
./zig-out/bin/geyser csv -o dRiftyHA39MWEi3m9aunc5MzRF1JYuBsbn6VPcn33UH
```

-## Architecture
+## Benchmarks

-### how data is written/read
+We also have benchmarking to measure the throughput of geyser. You can run it using:

-currently, data is serialized and written through the pipe using `bincode`
```bash
zig build -Doptimize=ReleaseSafe
./zig-out/bin/benchmark geyser
```

-data is organized to be written as `[size, serialized_data]`
*Note*: due to output formatting, geyser results are off by default; to turn them on,
you will need to change the logger to use the `debug` level:

-where `size` is the full length of the `serialized_data`
```zig
var std_logger = sig.trace.DirectPrintLogger.init(
    allocator,
-    .info,
+    .debug,
);
```

You can also benchmark a dummy reader with production data:

```bash
# run the snapshot validator with geyser enabled
./zig-out/bin/sig snapshot-validate --enable-geyser &

# benchmark how fast you can read (data is read and discarded)
./zig-out/bin/geyser benchmark
```

-this allows for more efficient buffered reads where you can read the first 8 bytes in
## Architecture

### How Data is Read/Written

Currently, data is serialized and written through the pipe using `bincode`, as it is a simple and efficient
encoding format already used in the repo (future work can use faster encoding schemes if required).

Data is organized to be written as `[size, serialized_data]`
where `size` is the full length of the `serialized_data`.

This allows for more efficient buffered reads where you can read the first 8 bytes in
the pipe, cast to a u64, allocate a buffer of that size and then read the rest of
-the data associated with that payload.
the data associated with that payload:

-the key struct used is `AccountPayload` which uses a versioned system to support different payload types (`VersionedAccountPayload`) while also being backwards compatibility.
```zig
/// reads a payload from the pipe and returns the total bytes read with the data
pub fn readPayload(self: *GeyserReader) !struct { u64, VersionedAccountPayload } {
    const len = try self.readType(u64, 8);
    const versioned_payload = try self.readType(VersionedAccountPayload, len);
    return .{ 8 + len, versioned_payload };
}
```

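For reference, the write side mirrors this framing. Below is a hedged sketch of what a write-side counterpart could look like (the helper name and the `sig.bincode.writeAlloc` call are assumptions, not the repo's confirmed API):

```zig
// Hypothetical write-side counterpart of `readPayload` (sketch only):
// frame the payload as [size, serialized_data] before writing to the pipe.
fn writePayloadSketch(
    allocator: std.mem.Allocator,
    pipe: std.fs.File,
    payload: VersionedAccountPayload,
) !void {
    // serialize the payload (assumes a bincode helper that allocates)
    const bytes = try sig.bincode.writeAlloc(allocator, payload, .{});
    defer allocator.free(bytes);

    // write the u64 size prefix first, then the serialized data
    var size_buf: [8]u8 = undefined;
    std.mem.writeInt(u64, &size_buf, bytes.len, .little);
    try pipe.writeAll(&size_buf);
    try pipe.writeAll(bytes);
}
```
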
-### GeyserWriter
The key struct used is `AccountPayload` which uses a versioned system to support
different payload types (`VersionedAccountPayload`) while also remaining backwards compatible.

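The versioning pattern can be pictured as a tagged union, sketched below with hypothetical variant names (the real definitions live in `src/geyser/lib.zig`):

```zig
// Illustrative shape of a versioned payload (variant names are hypothetical):
// the version tag is encoded with the payload, so readers can dispatch on it
// while older payload layouts keep decoding correctly.
const VersionedPayloadSketch = union(enum(u8)) {
    v1: struct { pubkey: [32]u8, data: []const u8 },
    v2: struct { pubkey: [32]u8, data: []const u8, slot: u64 },
};
```
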
-![](/img/2024-08-07-17-27-36.png)
### Geyser Writer

#### IO Thread

-the writer uses a separate thread to write to the pipe due to expensive i/o operations.
The GeyserWriter uses a separate thread to write to the pipe due to expensive i/o operations.
To spawn this thread, use the `spawnIOLoop` method.

-it loops, draining the channel for payloads with type (`[]u8`) and then writes the bufs to the pipe and then frees the payload using the `RecycleFBA`
It loops, draining the channel for payloads of type `[]u8`, writing the bufs to the
pipe and then freeing the payload using the `RecycleFBA`.

#### RecycleFBA

@@ -102,7 +136,7 @@ records field.

When free is called, we find the buffer in the records and set the record's `is_free = true`.

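As a conceptual sketch of that free-list behavior (hypothetical names; not the actual `RecycleFBA` API):

```zig
const std = @import("std");

// Conceptual sketch of the recycle pattern (hypothetical names): alloc
// reuses the first free record that is large enough, and free just flips
// the record's flag instead of returning memory to the backing allocator.
const RecycleSketch = struct {
    const Record = struct { is_free: bool, buf: []u8 };
    records: std.ArrayList(Record),

    fn alloc(self: *RecycleSketch, n: usize) ?[]u8 {
        for (self.records.items) |*record| {
            if (record.is_free and record.buf.len >= n) {
                record.is_free = false;
                return record.buf[0..n];
            }
        }
        return null; // caller falls back to a fresh allocation
    }

    fn free(self: *RecycleSketch, buf: []u8) void {
        // find the buffer in the records and mark it reusable
        for (self.records.items) |*record| {
            if (record.buf.ptr == buf.ptr) {
                record.is_free = true;
                return;
            }
        }
    }
};
```
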
-#### usage
#### Usage

```zig
// setup writer
    // ...
```
86 changes: 67 additions & 19 deletions docs/docusaurus/docs/code/gossip.md
@@ -8,39 +8,87 @@ For an introduction to Solana's gossip protocol, check out the technical section

Checkout the full engineering blog post here: [https://blog.syndica.io/sig-engineering-1-gossip-protocol/](https://blog.syndica.io/sig-engineering-1-gossip-protocol/).

## Repository File Outline

-- `service.zig`: main logic for reading, processing, and sending gossip messages
The main struct files include:
- `service.zig`: reading, processing, and sending gossip messages
- `table.zig`: where gossip data is stored
-- `data.zig`: various gossip data structure definitions
- `data.zig`: various gossip data definitions
- `pull_request.zig`: logic for sending pull *requests*
- `pull_response.zig`: logic for sending pull *responses* (/handling incoming pull requests)
-- `gossip_shards.zig`: datastructure which stores gossip data hashes for quick lookup - used in `gossip_table` and constructing pull responses
- `gossip_shards.zig`: datastructure which stores gossip data hashes for quick lookup (used in `gossip_table` and constructing pull responses)
- `active_set.zig`: logic for deriving a list of peers to send push messages to
- `ping_pong.zig`: logic for sending ping/pong messages as a heartbeat check

-A gossip spy is, in essence, software written to do two things: store data and send/receive requests.
Other files include:
- `fuzz_service.zig`: a fuzzing client for testing the gossip service
- `fuzz_table.zig`: a fuzzing client for testing the gossip table

## Usage

Simple usage of the gossip service is as follows:

```zig
const service = try GossipService.create(
    // general allocator
    std.heap.page_allocator,
    // allocator specifically for gossip values
    std.heap.page_allocator,
    // information about the current node to share with the network (via gossip)
    contact_info,
    // keypair for signing messages
    my_keypair,
    // entrypoints to discover peers
    entrypoints,
    // logger
    logger,
);
// start the gossip service (ie, spin up the threads
// to process and generate messages)
try service.start(.{
    .spy_node = false,
    .dump = false,
});
```

*Note:* a `spy_node` is a node that listens to gossip messages but does not send any.
This is useful for debugging and monitoring the network.

*Note:* `dump` is a flag to print out the gossip table to a file every 10 seconds
(see `dump_service.zig` for more).

*Note:* for an easy-to-use example, see `initGossipFromCluster` in `helpers.zig`.

## Benchmarks

-benchmarks are located at the bottom of `service.zig`.
-to run the benchmarks:
-- build sig in `ReleaseSafe` (ie, `zig build -Doptimize=ReleaseSafe`)
-- run `./zig-out/bin/benchmark gossip`
Benchmarks are located at the bottom of `service.zig`:
- `BenchmarkGossipServiceGeneral`: benchmarks ping, push, and pull response
  messages
- `BenchmarkGossipServicePullRequest`: benchmarks pull request messages (which require
  a bit more work to construct)

-this includes processing times for pings, push messages, pull responses, and
-pull requests.
You can run both benchmarks using: `./zig-out/bin/benchmark gossip`.

## Fuzzing

-the fuzzing client is located in `fuzz.zig`.
We support two fuzzing options:
- `fuzz_service.zig`: fuzzing the gossip service
- `fuzz_table.zig`: fuzzing the gossip table

### Fuzzing the Service

```bash
zig build -Dno-run fuzz

fuzz gossip_service <seed> <number_of_actions>
```

### Fuzzing the Table

```bash
zig build -Dno-run fuzz

-to run the client
-- start a sig gossip in a terminal (ie, listening on `8001`)
-- build the fuzz client in `ReleaseSafe` (ie, `zig build -Doptimize=ReleaseSafe`)
-- run the fuzz client pointing to sig with some seed and some number of random messages
-to send: `./zig-out/bin/fuzz <entrypoint> <seed> <num_messages>` (eg, `./zig-out/bin/fuzz 127.0.0.1:8001 19 100000`)
fuzz gossip_table <seed> <number_of_actions>
```

## Architecture

16 changes: 8 additions & 8 deletions docs/docusaurus/docs/contributing/_category_.json
@@ -1,10 +1,10 @@
{
-  "position": 4,
-  "label": "Contributing",
-  "collapsible": true,
-  "collapsed": true,
-  "className": "red",
-  "customProps": {
-    "description": "How to contribute to this project"
-  }
+  "position": 5,
+  "label": "Contributing",
+  "collapsible": true,
+  "collapsed": true,
+  "className": "red",
+  "customProps": {
+    "description": "Tools to use throughout the repository"
+  }
}
47 changes: 0 additions & 47 deletions docs/docusaurus/docs/contributing/dev-tools.mdx

This file was deleted.
