INDEX
Explanations
No Explanations Found
Negative Logits
Token       Logit
Seym        -0.75
mathemat    -0.68
handshake   -0.67
Split       -0.66
wors        -0.66
poppy       -0.65
portrayal   -0.64
scapego     -0.63
bond        -0.62
drib        -0.61
Positive Logits
Token       Logit
abeth       0.72
ographers   0.71
Whitman     0.69
ographer    0.69
tml         0.66
vana        0.65
leukemia    0.65
^^^^        0.63
aspx        0.63
^^          0.63
Activations Density: 0.000%
No Known Activations
This feature has no known activations.