INDEX
Explanations
No Explanations Found
Negative Logits
CLS           -0.74
Pwr           -0.70
Forth         -0.66
farming       -0.65
sol           -0.64
migration     -0.64
MAP           -0.63
Accounting    -0.63
revival       -0.63
fortunes      -0.62
Positive Logits
fight         0.84
iens          0.81
oug           0.79
î             0.76
verbal        0.76
ividually     0.76
iott          0.75
cats          0.74
ittens        0.74
words         0.74
Activations Density 0.000%
No Known Activations