Explanations
No Explanations Found
Negative Logits
Token       Logit
glers       -0.74
comed       -0.67
Ulster      -0.64
Vine        -0.64
aceous      -0.62
Polic       -0.62
Soldiers    -0.61
Pv          -0.61
Gorsuch     -0.61
Sorce       -0.60
Positive Logits
Token           Logit
athy            0.87
unts            0.74
anton           0.71
independence    0.71
kus             0.71
ngth            0.70
ector           0.69
xon             0.69
fman            0.68
eyes            0.67
Activations Density: 0.000%
No Known Activations
This feature has no known activations.