Explanations
specific numerical values and related qualifiers in text
Negative Logits
yms       -0.16
åħ        -0.16
forman    -0.15
[port     -0.15
agged     -0.14
disarm    -0.14
uben      -0.14
İY        -0.14
pty       -0.14
ルク      -0.14
Positive Logits
dick      0.18
çek       0.17
вар       0.15
lip       0.15
Lip       0.15
blade     0.15
belt      0.15
bur       0.15
Brick     0.15
strip     0.14
Activations Density 0.025%