Explanations
No Explanations Found
Negative Logits
Token         Logit
mand          -0.79
recomm        -0.74
Theft         -0.71
imore         -0.69
delinqu       -0.63
anyl          -0.63
plurality     -0.63
pard          -0.62
Prohibition   -0.62
complying     -0.62
Positive Logits
Token         Logit
aldo          0.82
anto          0.72
ItemTracker   0.71
Krugman       0.70
oker          0.68
icio          0.67
wagon         0.67
vp            0.67
okers         0.66
ivia          0.65
Activations Density: 0.000%
No Known Activations
This feature has no known activations.