INDEX
Explanations
No Explanations Found
Negative Logits
iott            -0.71
namese          -0.71
vati            -0.69
resid           -0.68
footing         -0.67
prosecut        -0.66
hemor           -0.66
planner         -0.63
impulse         -0.61
disproportion   -0.61
Positive Logits
Amateur         0.76
uther           0.75
plet            0.74
idding          0.72
atel            0.69
utical          0.68
udos            0.68
olicited        0.68
ysical          0.67
reens           0.66
Activations Density 0.000%
No Known Activations
This feature has no known activations.