INDEX
Explanations
mentions of a specific person's name - "Clinton"
mentions of the name "Clinton."
Negative Logits
Token       Logit
teenth      -0.84
GGGGGGGG    -0.73
DIS         -0.70
Gareth      -0.69
lass        -0.68
raints      -0.68
MAT         -0.67
orescent    -0.65
oya         -0.65
PDATE       -0.64
Positive Logits
Token        Logit
Clinton      0.97
Clinton      0.92
mia          0.88
Rodham       0.87
INTON        0.84
impeachment  0.84
istas        0.82
ite          0.82
clinton      0.81
herself      0.80
Activations Density: 0.037%