INDEX
Explanations
the name "Clinton" and its variations throughout the text
Negative Logits
teenth        -0.81
GGGGGGGG      -0.72
raints        -0.71
Gareth        -0.68
CAST          -0.67
lass          -0.66
aic           -0.66
DIS           -0.66
PDATE         -0.66
eatures       -0.66
Positive Logits
Clinton        0.99
Clinton        0.96
Rodham         0.90
clinton        0.88
herself        0.87
mia            0.87
impeachment    0.86
INTON          0.84
ite            0.83
istas          0.83
Activations Density: 0.023%