INDEX
Explanations
names of political figures
mentions of specific public figures and their actions or attributes
Negative Logits
resso           -0.65
Archdemon       -0.65
Directions      -0.64
isu             -0.64
repositories    -0.64
Puzzle          -0.64
apo             -0.62
IRE             -0.61
sett            -0.60
rall            -0.59
Positive Logits
gyn             1.47
ancel           1.01
uala            0.87
Huckabee        0.86
atur            0.81
Kelly           0.79
oad             0.77
ously           0.77
iliate          0.74
Emin            0.74
Activations Density 0.005%