Explanations
names of individuals
names of individuals mentioned in the text
Negative Logits
atos        -0.97
een         -0.86
ator        -0.86
alion       -0.83
irth        -0.82
assian      -0.81
sworth      -0.77
pipe        -0.77
stuff       -0.77
hip         -0.74
Positive Logits
Wasserman   0.99
Fey         0.84
Schmidt     0.77
Debbie      0.75
Jarrett     0.73
xtap        0.73
Diane       0.72
Kelley      0.71
DeVos       0.69
Pelosi      0.69
Activations Density: 0.006%