INDEX
Explanations
Code-like identifiers and technical jargon strings, especially mixed-case (camelCase) or concatenated tokens.
Negative Logits
Token  Logit
0      -0.10
1      -0.10
5      -0.10
3      -0.10
4      -0.09
7      -0.09
9      -0.09
2      -0.09
13     -0.08
19     -0.08
Positive Logits
Token    Logit
Birch    0.08
โล       0.07
odu      0.07
usta     0.07
Schwarz  0.07
enberg   0.07
uong     0.07
nore     0.07
Royale   0.07
staw     0.07
Activations Density: 4.068%