INDEX
Explanations
No Explanations Found
Negative Logits
EEP          -0.68
sonian       -0.67
eas          -0.66
FW           -0.65
RY           -0.65
Oath         -0.64
score        -0.62
Shall        -0.62
Ws           -0.61
vernment     -0.60
Positive Logits
unfocused    0.70
ryu          0.67
accelerator  0.66
obscured     0.65
unarmed      0.64
Escape       0.61
Suzanne      0.60
igen         0.60
urion        0.59
nausea       0.57
Activations Density: 0.000%
No Known Activations
This feature has no known activations.