Commit 3baca01

feat: JA4 fingerprinting (#4669)
1 parent e2fcfab commit 3baca01

21 files changed, +2450 -31 lines changed

api/unstable/fingerprint.h

+22 -4
@@ -27,12 +27,17 @@
  * and marked as stable after an initial customer integration and feedback.
  */

+/* Available fingerprinting methods.
+ *
+ * The current recommendation is to use JA4. JA4 sorts some of the lists it includes
+ * in the fingerprint, making it more resistant to the list reordering done by
+ * Chrome and other clients.
+ */
 typedef enum {
-    /*
-     * The current standard open source fingerprinting method.
-     * See https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967.
-     */
+    /* See https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967 */
     S2N_FINGERPRINT_JA3,
+    /* See https://github.com/FoxIO-LLC/ja4/tree/main */
+    S2N_FINGERPRINT_JA4,
 } s2n_fingerprint_type;

 struct s2n_fingerprint;
@@ -99,6 +104,11 @@ S2N_API int s2n_fingerprint_get_hash_size(const struct s2n_fingerprint *fingerpr
  * - See https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967
  * - Example: "c34a54599a1fbaf1786aa6d633545a60"
  *
+ * JA4: A string consisting of three parts, separated by underscores: the prefix,
+ * and the hex-encoded truncated SHA256 hashes of the other two parts of the raw string.
+ * - See https://github.com/FoxIO-LLC/ja4/blob/v0.18.2/technical_details/JA4.md
+ * - Example: "t13i310900_e8f1e7e78f70_1f22a2ca17c4"
+ *
  * @param fingerprint The s2n_fingerprint to be used for the hash
  * @param max_output_size The maximum size of data that may be written to `output`.
  * If `output` is too small, an S2N_ERR_T_USAGE error will occur.
@@ -134,6 +144,14 @@ S2N_API int s2n_fingerprint_get_raw_size(const struct s2n_fingerprint *fingerpri
  * 49188-49192-107-49187-49191-103-49162-49172-57-49161-49171-51-157-
  * 156-61-60-53-47-255,11-10-35-22-23-13-43-45-51,29-23-30-25-24,0-1-2"
  *
+ * JA4: A string consisting of three parts: a prefix, and two lists of hex values.
+ * - See https://github.com/FoxIO-LLC/ja4/blob/v0.18.2/technical_details/JA4.md
+ * - Example: "t13i310900_002f,0033,0035,0039,003c,003d,0067,006b,009c,009d,009e,
+ * 009f,00ff,1301,1302,1303,c009,c00a,c013,c014,c023,c024,c027,c028,
+ * c02b,c02c,c02f,c030,cca8,cca9,ccaa_000a,000b,000d,0016,0017,0023,
+ * 002b,002d,0033_0403,0503,0603,0807,0808,0809,080a,080b,0804,0805,
+ * 0806,0401,0501,0601,0303,0301,0302,0402,0502,0602"
+ *
  * @param fingerprint The s2n_fingerprint to be used for the raw string
  * @param max_output_size The maximum size of data that may be written to `output`.
  * If `output` is too small, an S2N_ERR_T_USAGE error will occur.

bindings/rust/s2n-tls/src/fingerprint.rs

+2
@@ -17,12 +17,14 @@ use core::ptr::NonNull;
 #[derive(Copy, Clone)]
 pub enum FingerprintType {
     JA3,
+    JA4,
 }

 impl From<FingerprintType> for s2n_tls_sys::s2n_fingerprint_type::Type {
     fn from(value: FingerprintType) -> Self {
         match value {
             FingerprintType::JA3 => s2n_tls_sys::s2n_fingerprint_type::FINGERPRINT_JA3,
+            FingerprintType::JA4 => s2n_tls_sys::s2n_fingerprint_type::FINGERPRINT_JA4,
         }
     }
 }
@@ -0,0 +1,150 @@
# JA4: TLS Client Fingerprinting

![JA4](https://github.com/FoxIO-LLC/ja4/blob/main/technical_details/JA4.png)

JA4 looks at the TLS Client Hello packet and builds a fingerprint of the client based on attributes within the packet.

### JA4 Algorithm:
(QUIC=”q” or TCP=”t”)
(2 character TLS version)
(SNI=”d” or no SNI=”i”)
(2 character count of ciphers)
(2 character count of extensions)
(first and last characters of first ALPN extension value)
_
(sha256 hash of the list of cipher hex codes sorted in hex order, truncated to 12 characters)
_
(sha256 hash of (the list of extension hex codes sorted in hex order)_(the list of signature algorithms), truncated to 12 characters)

The end result is a fingerprint that looks like:
t13d1516h2_8daaf6152771_b186095e22b6

## Details:
The program needs to ignore GREASE values anywhere it sees them: (https://datatracker.ietf.org/doc/html/draft-davidben-tls-grease-01#page-5)
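
A minimal Rust sketch of the GREASE rule, for illustration only. It relies on the fact that the 16-bit GREASE code points follow the pattern 0x0a0a, 0x1a1a, ..., 0xfafa; the function names `is_grease` and `without_grease` are hypothetical and are not taken from this commit.

```rust
/// Returns true for the sixteen GREASE code points 0x0a0a, 0x1a1a, ..., 0xfafa:
/// both bytes are equal and the low nibble of each byte is 0xa.
fn is_grease(value: u16) -> bool {
    let high = (value >> 8) as u8;
    let low = (value & 0x00ff) as u8;
    high == low && (low & 0x0f) == 0x0a
}

/// Drops GREASE values from a list while preserving the original order.
/// (Hypothetical helper; a real parser would likely filter while iterating.)
fn without_grease(values: &[u16]) -> Vec<u16> {
    values.iter().copied().filter(|v| !is_grease(*v)).collect()
}
```

Filtering once, before counting, sorting, or hashing, keeps the later steps free of special cases.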

### QUIC:
https://en.wikipedia.org/wiki/QUIC
“q” or “t”, which denotes whether the hello packet is for QUIC or TCP. QUIC is the protocol which the new HTTP/3 standard utilizes, encapsulating TLS 1.3 into UDP packets. As QUIC was developed by Google, if an organization heavily utilizes Google products, QUIC could make up half of their network traffic, so this is important to capture.

If the protocol is QUIC then the first character of the fingerprint is “q” if not, it’s “t”.

### TLS Version:
TLS version is shown in 3 different places. If extension 0x002b exists (supported_versions), then the version is the highest value in the extension. Remember to ignore GREASE values. If the extension doesn’t exist, then the TLS version is the value of the Protocol Version. Handshake version (located at the top of the packet) should be ignored.

0x0304 = TLS 1.3 = “13”
0x0303 = TLS 1.2 = “12”
0x0302 = TLS 1.1 = “11”
0x0301 = TLS 1.0 = “10”
0x0300 = SSL 3.0 = “s3”
0x0200 = SSL 2.0 = “s2”
0x0100 = SSL 1.0 = “s1”

Unknown = “00”
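
The version rule and the table above could be sketched in Rust roughly as follows. The function names are hypothetical. Falling back to the Protocol Version when supported_versions contains only GREASE values is an assumption of this sketch, since the text does not cover that case.

```rust
/// GREASE code points look like 0x0a0a, 0x1a1a, ..., 0xfafa.
fn is_grease(value: u16) -> bool {
    (value >> 8) == (value & 0x00ff) && (value & 0x000f) == 0x000a
}

/// Picks the version to report: the highest non-GREASE entry of
/// supported_versions (extension 0x002b) if present, else the Protocol Version.
/// Assumption: an extension holding only GREASE falls back to the Protocol Version.
fn select_version(supported_versions: Option<&[u16]>, protocol_version: u16) -> u16 {
    supported_versions
        .and_then(|list| list.iter().copied().filter(|v| !is_grease(*v)).max())
        .unwrap_or(protocol_version)
}

/// Maps a wire version to the two characters used in the fingerprint.
fn ja4_tls_version(version: u16) -> &'static str {
    match version {
        0x0304 => "13",
        0x0303 => "12",
        0x0302 => "11",
        0x0301 => "10",
        0x0300 => "s3",
        0x0200 => "s2",
        0x0100 => "s1",
        _ => "00", // unknown
    }
}
```

The GREASE filter matters here because a value such as 0x7a7a is numerically larger than 0x0304 and would otherwise win the "highest value" comparison.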

### SNI:
If the SNI extension (0x0000) exists, then the destination of the connection is a domain, or “d” in the fingerprint. If the SNI does not exist, then the destination is an IP address, or “i”.

### Number of Ciphers:
2 character number of cipher suites, so if there’s 6 cipher suites in the hello packet, then the value should be “06”. If there’s > 99, which there should never be, then output “99”. Remember, ignore GREASE values. They don’t count.

### Number of Extensions:
Same as counting ciphers. Ignore GREASE. Include SNI and ALPN.
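
Both counts follow the same rule, which can be sketched in Rust as a single hypothetical helper `ja4_count`: skip GREASE, cap at 99, and zero-pad to two characters.

```rust
/// GREASE code points look like 0x0a0a, 0x1a1a, ..., 0xfafa.
fn is_grease(value: u16) -> bool {
    (value >> 8) == (value & 0x00ff) && (value & 0x000f) == 0x000a
}

/// Formats the number of non-GREASE entries as exactly two decimal characters.
/// Counts above 99 are reported as "99". SNI (0x0000) and ALPN (0x0010) are
/// ordinary values here, so they are included in the extension count.
fn ja4_count(values: &[u16]) -> String {
    let count = values.iter().filter(|v| !is_grease(**v)).count();
    format!("{:02}", count.min(99))
}
```

The same function applies to the cipher suite list and to the extension list.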

### ALPN Extension Value:
The first and last characters of the ALPN (Application-Layer Protocol Negotiation) first value.
List of possible ALPN Values (scroll down): https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml

In the above example, the first ALPN value is h2 so the first and last characters to use in the fingerprint are “h2”. IF the first ALPN listed was http/1.1 then the first and last characters to use in the fingerprint would be “h1”.

In Wireshark this field is located under tls.handshake.extensions_alpn_str

If there are no ALPN values or no ALPN extension then we print “00” as the value in the fingerprint.
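
A short Rust sketch of the ALPN rule. The name `ja4_alpn` is hypothetical, and treating an empty first value the same as a missing one is an assumption of this sketch rather than something the text states.

```rust
/// Returns the first and last characters of the first ALPN value, or "00"
/// when there is no ALPN extension or no value.
/// Assumption: an empty first value is treated like a missing one.
fn ja4_alpn(first_alpn: Option<&str>) -> String {
    let value = first_alpn.unwrap_or("");
    match (value.chars().next(), value.chars().last()) {
        (Some(first), Some(last)) => format!("{}{}", first, last),
        _ => "00".to_string(),
    }
}
```

For a single-character value the first and last characters coincide, so the character is simply repeated.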

### Cipher hash:
A 12 character truncated sha256 hash of the list of ciphers sorted in hex order, first 12 characters. The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
Example:
```
1301,1302,1303,c02b,c02f,c02c,c030,cca9,cca8,c013,c014,009c,009d,002f,0035
```
Is sorted to:
```
002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771
```
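
The list-building half of this step can be sketched in Rust with the standard library alone. The SHA-256 step is deliberately left out because the Rust standard library has no SHA-256 implementation. The function name `ja4_cipher_list` is hypothetical.

```rust
/// GREASE code points look like 0x0a0a, 0x1a1a, ..., 0xfafa.
fn is_grease(value: u16) -> bool {
    (value >> 8) == (value & 0x00ff) && (value & 0x000f) == 0x000a
}

/// Builds the string that is fed to SHA-256 for the cipher part:
/// GREASE removed, sorted, 4-character lower-case hex, comma delimited.
/// Sorting the u16 values numerically gives the same order as sorting the
/// fixed-width hex strings.
fn ja4_cipher_list(ciphers: &[u16]) -> String {
    let mut sorted: Vec<u16> = ciphers.iter().copied().filter(|c| !is_grease(*c)).collect();
    sorted.sort_unstable();
    sorted
        .iter()
        .map(|c| format!("{:04x}", c))
        .collect::<Vec<String>>()
        .join(",")
}
```

Passing the example cipher list from above yields the sorted string shown in the second block; hashing that string and keeping the first 12 hex characters would give the cipher part of the fingerprint.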

### Extension hash:
A 12 character truncated sha256 hash of the list of extensions, sorted by hex value, followed by the list of signature algorithms, in the order that they appear (not sorted).

The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _b_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.

For example:
```
001b,0000,0033,0010,4469,0017,002d,000d,0005,0023,0012,002b,ff01,000b,000a,0015
```
Is sorted to:
```
0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01
```
(notice 0000 and 0010 is removed)

The signature algorithm hex values are then added to the end of the list in the order that they appear (not sorted) with an underscore delimiting the two lists.
For example the signature algorithms:
```
0403,0804,0401,0503,0805,0501,0806,0601
```
Are added to the end of the previous string to create:
```
0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01_0403,0804,0401,0503,0805,0501,0806,0601
```
Hashed to:
```
e5627efa2ab19723084c1033a96c694a45826ab5a460d2d3fd5ffcfe97161c95
```
Truncated to first 12 characters:
```
e5627efa2ab1
```

If there are no signature algorithms in the hello packet, then the string ends without an underscore and is hashed.
For example:
```
0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01 = 6d807ffa2a79
```
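
A Rust sketch of how the pre-hash string for this section could be assembled, covering both the case with signature algorithms and the case without. As with the cipher list, the SHA-256 step is omitted because it needs a library outside the standard one. The names `hex_list` and `ja4_extension_string` are hypothetical.

```rust
/// GREASE code points look like 0x0a0a, 0x1a1a, ..., 0xfafa.
fn is_grease(value: u16) -> bool {
    (value >> 8) == (value & 0x00ff) && (value & 0x000f) == 0x000a
}

/// Formats values as 4-character lower-case hex, comma delimited.
fn hex_list(values: &[u16]) -> String {
    values.iter().map(|v| format!("{:04x}", v)).collect::<Vec<String>>().join(",")
}

/// Builds the string that is hashed for the extension part:
/// sorted extensions without GREASE, SNI (0x0000), or ALPN (0x0010), then an
/// underscore and the signature algorithms in their original order.
/// With no signature algorithms the underscore is omitted.
fn ja4_extension_string(extensions: &[u16], signature_algorithms: &[u16]) -> String {
    let mut sorted: Vec<u16> = extensions
        .iter()
        .copied()
        .filter(|e| !is_grease(*e) && *e != 0x0000 && *e != 0x0010)
        .collect();
    sorted.sort_unstable();

    let sigs: Vec<u16> = signature_algorithms
        .iter()
        .copied()
        .filter(|s| !is_grease(*s))
        .collect();

    if sigs.is_empty() {
        hex_list(&sorted)
    } else {
        format!("{}_{}", hex_list(&sorted), hex_list(&sigs))
    }
}
```

Only the extension list is sorted; the signature algorithms keep the order in which the client sent them.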

### Example

JA4 fingerprint:
t (TLS over TCP)
13 (TLS version 1.3)
d (SNI exists so it’s going to a domain)
15 (15 cipher suites ignoring grease)
16 (16 extensions ignoring grease)
h2 (first and last characters of the first ALPN extension value)
_
8daaf6152771 (truncated sha256 hash of the list of ciphers sorted)
_
e5627efa2ab1 (truncated sha256 hash of the list of extensions sorted, SNI and ALPN removed, followed by the list of signature algorithms)
```
JA4 = t13d1516h2_8daaf6152771_e5627efa2ab1
```
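
To illustrate how the pieces of this example fit together, here is a Rust sketch that assembles the prefix and joins it with two already-computed truncated hashes. The struct `HelloSummary` and both function names are hypothetical, and the hashes are passed in as strings rather than computed.

```rust
/// Hypothetical summary of the Client Hello fields that feed the prefix.
struct HelloSummary<'a> {
    is_quic: bool,
    version: &'a str,       // already mapped to two characters, e.g. "13"
    has_sni: bool,
    cipher_count: usize,    // GREASE already excluded
    extension_count: usize, // GREASE already excluded, SNI and ALPN included
    alpn: &'a str,          // first and last characters, or "00"
}

/// Builds the first section of the fingerprint, e.g. "t13d1516h2".
fn ja4_prefix(hello: &HelloSummary) -> String {
    format!(
        "{}{}{}{:02}{:02}{}",
        if hello.is_quic { 'q' } else { 't' },
        hello.version,
        if hello.has_sni { 'd' } else { 'i' },
        hello.cipher_count.min(99),
        hello.extension_count.min(99),
        hello.alpn
    )
}

/// Joins the prefix with the two 12-character truncated hashes.
fn ja4_join(prefix: &str, cipher_hash: &str, extension_hash: &str) -> String {
    format!("{}_{}_{}", prefix, cipher_hash, extension_hash)
}
```

With the values from the example (TCP, TLS 1.3, SNI present, 15 ciphers, 16 extensions, ALPN "h2") the prefix comes out as `t13d1516h2`.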
### Raw Output
The program should allow for raw outputs either sorted or original.
-r (raw fingerprint) -o (original)

The raw fingerprint for JA4 would look like this:
```
JA4_r = t13d1516h2_002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9_0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01_0403,0804,0401,0503,0805,0501,0806,0601
```

The "o" option includes the original values in the original order, less GREASE values. This means SNI (0000) and ALPN (0010) are included.

The raw fingerprint with the original ordering (-o) would look like this:
```
JA4_ro = t13d1516h2_1301,1302,1303,c02b,c02f,c02c,c030,cca9,cca8,c013,c014,009c,009d,002f,0035_001b,0000,0033,0010,4469,0017,002d,000d,0005,0023,0012,002b,ff01,000b,000a,0015_0403,0804,0401,0503,0805,0501,0806,0601
```
When ‘-o’ flag is specified, ‘ja4’ field must be renamed to ‘ja4_o’:
```
JA4_o = t13d1516h2_acb858a92679_18f69afefd3d
```
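
The difference between the sorted and original-order raw outputs can be sketched in Rust with one flag. This is an illustration of the rules in this section, not the s2n-tls implementation. The function `ja4_raw` and its `original_order` parameter are hypothetical.

```rust
/// GREASE code points look like 0x0a0a, 0x1a1a, ..., 0xfafa.
fn is_grease(value: u16) -> bool {
    (value >> 8) == (value & 0x00ff) && (value & 0x000f) == 0x000a
}

/// Formats values as 4-character lower-case hex, comma delimited.
fn hex_list(values: &[u16]) -> String {
    values.iter().map(|v| format!("{:04x}", v)).collect::<Vec<String>>().join(",")
}

/// Builds the raw fingerprint string.
/// original_order == false: ciphers and extensions sorted, SNI and ALPN removed (JA4_r).
/// original_order == true: original order, SNI and ALPN kept (JA4_ro).
/// GREASE is dropped in both modes.
fn ja4_raw(
    prefix: &str,
    ciphers: &[u16],
    extensions: &[u16],
    signature_algorithms: &[u16],
    original_order: bool,
) -> String {
    let mut ciphers: Vec<u16> = ciphers.iter().copied().filter(|c| !is_grease(*c)).collect();
    let mut extensions: Vec<u16> = extensions.iter().copied().filter(|e| !is_grease(*e)).collect();
    if !original_order {
        ciphers.sort_unstable();
        extensions.retain(|e| *e != 0x0000 && *e != 0x0010);
        extensions.sort_unstable();
    }
    let sigs: Vec<u16> = signature_algorithms.iter().copied().filter(|s| !is_grease(*s)).collect();

    let mut raw = format!("{}_{}_{}", prefix, hex_list(&ciphers), hex_list(&extensions));
    if !sigs.is_empty() {
        raw.push('_');
        raw.push_str(&hex_list(&sigs));
    }
    raw
}
```

Keeping both modes in one function makes it easy to see that the only differences are the sort and the SNI/ALPN removal.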

@@ -0,0 +1,26 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#alpn-extension-value"

# ### ALPN Extension Value:
#
# The first and last characters of the ALPN (Application-Layer Protocol Negotiation) first value.
# List of possible ALPN Values (scroll down): https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml
#
#
#
# In the above example, the first ALPN value is h2 so the first and last characters to use in the fingerprint are “h2”. IF the first ALPN listed was http/1.1 then the first and last characters to use in the fingerprint would be “h1”.
#
# In Wireshark this field is located under tls.handshake.extensions_alpn_str
#
# If there are no ALPN values or no ALPN extension then we print “00” as the value in the fingerprint.
#
[[spec]]
level = "MUST"
quote = '''
The first and last characters of the ALPN (Application-Layer Protocol Negotiation) first value.
'''

[[spec]]
level = "MUST"
quote = '''
If there are no ALPN values or no ALPN extension then we print “00” as the value in the fingerprint.
'''
@@ -0,0 +1,27 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#cipher-hash"

# ### Cipher hash:
#
# A 12 character truncated sha256 hash of the list of ciphers sorted in hex order, first 12 characters. The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
# Example:
# ```
# 1301,1302,1303,c02b,c02f,c02c,c030,cca9,cca8,c013,c014,009c,009d,002f,0035
# ```
# Is sorted to:
# ```
# 002f,0035,009c,009d,1301,1302,1303,c013,c014,c02b,c02c,c02f,c030,cca8,cca9 = 8daaf6152771
# ```
#

[[spec]]
level = "MUST"
quote = '''
A 12 character truncated sha256 hash of the list of ciphers sorted in hex order, first 12 characters.
'''

[[spec]]
level = "MUST"
quote = '''
The list is created using the 4 character hex values of the ciphers, lower case, comma delimited, ignoring GREASE.
'''

@@ -0,0 +1,12 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#details"

# ## Details:
#
# The program needs to ignore GREASE values anywhere it sees them: (https://datatracker.ietf.org/doc/html/draft-davidben-tls-grease-01#page-5)
#

[[spec]]
level = "MUST"
quote = '''
The program needs to ignore GREASE values anywhere it sees them
'''
@@ -0,0 +1,19 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#example"

# ### Example
#
# JA4 fingerprint:
# t (TLS over TCP)
# 13 (TLS version 1.3)
# d (SNI exists so it’s going to a domain)
# 15 (15 cipher suites ignoring grease)
# 16 (16 extensions ignoring grease)
# h2 (first and last characters of the first ALPN extension value)
# _
# 8daaf6152771 (truncated sha256 hash of the list of ciphers sorted)
# _
# e5627efa2ab1 (truncated sha256 hash of the list of extensions sorted, SNI and ALPN removed, followed by the list of signature algorithms)
# ```
# JA4 = t13d1516h2_8daaf6152771_e5627efa2ab1
# ```

@@ -0,0 +1,72 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#extension-hash"

# ### Extension hash:
#
# A 12 character truncated sha256 hash of the list of extensions, sorted by hex value, followed by the list of signature algorithms, in the order that they appear (not sorted).
#
# The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear). Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint. These values are omitted so that the same application would have the same _b_ section of the fingerprint regardless of if it were going to a domain, IP, or changing ALPNs.
#
# For example:
# ```
# 001b,0000,0033,0010,4469,0017,002d,000d,0005,0023,0012,002b,ff01,000b,000a,0015
# ```
# Is sorted to:
# ```
# 0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01
# ```
# (notice 0000 and 0010 is removed)
#
# The signature algorithm hex values are then added to the end of the list in the order that they appear (not sorted) with an underscore delimiting the two lists.
# For example the signature algorithms:
# ```
# 0403,0804,0401,0503,0805,0501,0806,0601
# ```
# Are added to the end of the previous string to create:
# ```
# 0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01_0403,0804,0401,0503,0805,0501,0806,0601
# ```
# Hashed to:
# ```
# e5627efa2ab19723084c1033a96c694a45826ab5a460d2d3fd5ffcfe97161c95
# ```
# Truncated to first 12 characters:
# ```
# e5627efa2ab1
# ```
#
# If there are no signature algorithms in the hello packet, then the string ends without an underscore and is hashed.
# For example:
# ```
# 0005,000a,000b,000d,0012,0015,0017,001b,0023,002b,002d,0033,4469,ff01 = 6d807ffa2a79
# ```
#

[[spec]]
level = "MUST"
quote = '''
A 12 character truncated sha256 hash of the list of extensions, sorted by hex value, followed by the list of signature algorithms, in the order that they appear (not sorted).
'''

[[spec]]
level = "MUST"
quote = '''
The extension list is created using the 4 character hex values of the extensions, lower case, comma delimited, sorted (not in the order they appear).
'''

[[spec]]
level = "MUST"
quote = '''
Ignore the SNI extension (0000) and the ALPN extension (0010) as we’ve already captured them in the _a_ section of the fingerprint.
'''

[[spec]]
level = "MUST"
quote = '''
The signature algorithm hex values are then added to the end of the list in the order that they appear (not sorted) with an underscore delimiting the two lists.
'''

[[spec]]
level = "MUST"
quote = '''
If there are no signature algorithms in the hello packet, then the string ends without an underscore and is hashed.
'''
@@ -0,0 +1,19 @@
target = "https://raw.githubusercontent.com/FoxIO-LLC/ja4/v0.18.2/technical_details/JA4.md#ja4-algorithm"

# ### JA4 Algorithm:
#
# (QUIC=”q” or TCP=”t”)
# (2 character TLS version)
# (SNI=”d” or no SNI=”i”)
# (2 character count of ciphers)
# (2 character count of extensions)
# (first and last characters of first ALPN extension value)
# _
# (sha256 hash of the list of cipher hex codes sorted in hex order, truncated to 12 characters)
# _
# (sha256 hash of (the list of extension hex codes sorted in hex order)_(the list of signature algorithms), truncated to 12 characters)
#
# The end result is a fingerprint that looks like:
# t13d1516h2_8daaf6152771_b186095e22b6
#

