Explanations
No Explanations Found
Negative Logits
Byrne        -0.75
IZ           -0.66
rall         -0.64
Modes        -0.63
Spielberg    -0.61
Trave        -0.60
inse         -0.59
COUR         -0.59
undecided    -0.57
independ     -0.57
Positive Logits
arer         0.76
SourceFile   0.76
DonaldTrump  0.73
resents      0.72
ifer         0.72
hus          0.71
Enlarge      0.69
cause        0.68
âĵĺ          0.66
arers        0.65
Activations Density: 0.000%
No Known Activations
This feature has no known activations.