INDEX
Explanations
No Explanations Found
Negative Logits
emin      -0.69
lin       -0.66
bia       -0.65
chn       -0.64
grading   -0.64
asy       -0.63
unin      -0.63
Murray    -0.63
held      -0.62
assi      -0.62
Positive Logits
FU          0.71
pei         0.70
Caption     0.68
embodiment  0.67
Detective   0.65
smugg       0.63
Murd        0.60
Guy         0.60
mang        0.60
Constable   0.59
Activations Density: 0.000%
This feature has no known activations.