INDEX
Explanations
references to a specific political figure
mentions of the name "Carter."
Negative Logits
enance    -0.83
ged       -0.75
gement    -0.74
inals     -0.73
ortmund   -0.72
atical    -0.67
ocard     -0.66
ging      -0.64
aries     -0.63
udeb      -0.63
Positive Logits
Carter    0.82
bilt      0.79
lee       0.75
Carter    0.75
pan       0.74
waves     0.74
bent      0.74
isle      0.72
craft     0.71
Solo      0.70
Activations Density 0.020%