Explanations
No Explanations Found
Negative Logits
agar     -0.84
boss     -0.76
sburg    -0.73
hold     -0.72
rog      -0.69
oken     -0.67
ached    -0.67
ell      -0.67
gg       -0.66
aff      -0.66
Positive Logits
terday   0.71
cryst    0.70
scrut    0.70
foss     0.69
incent   0.68
livest   0.68
rpm      0.67
oult     0.67
ultr     0.66
Sask     0.66
Activations Density 0.000%
No Known Activations
This feature has no known activations.