Explanations
No Explanations Found
Negative Logits
Soccer    -0.72
Eleven    -0.71
Prot      -0.70
Quart     -0.68
Dak       -0.66
pole      -0.66
Ging      -0.65
Tanks     -0.63
Chall     -0.63
¥ŀ        -0.63
Positive Logits
irez      0.99
effects   0.76
FIELD     0.71
onyms     0.69
ierrez    0.67
ause      0.65
ourgeois  0.64
wd        0.64
insky     0.64
entially  0.64
Activations Density: 0.000%
No Known Activations
This feature has no known activations.