INDEX
Explanations
No Explanations Found
Negative Logits
failed      -0.66
DEM         -0.66
nominate    -0.63
RY          -0.63
dev         -0.62
Kon         -0.61
Rah         -0.60
Hud         -0.60
Krypt       -0.60
aj          -0.59
Positive Logits
acial       0.79
isure       0.74
passports   0.72
ivory       0.71
entimes     0.69
ople        0.68
thren       0.67
isine       0.67
elman       0.66
ertodd      0.66
Activations Density: 0.000%
No Known Activations
This feature has no known activations.