INDEX
Explanations
No Explanations Found
Negative Logits
Glob        -0.75
Theft       -0.73
loader      -0.69
Statements  -0.68
pict        -0.68
lasses      -0.68
Component   -0.67
cycl        -0.67
Airl        -0.65
odor        -0.65
Positive Logits
ufact       0.75
disse       0.70
resid       0.66
coerc       0.65
reditary    0.65
qqa         0.63
ascus       0.63
trou        0.62
metic       0.62
sued        0.60
Activations Density: 0.000%
No Known Activations
This feature has no known activations.