INDEX
Explanations
names of individuals related to controversies or significant events
Negative Logits
ument     -0.96
urat      -0.92
nces      -0.87
gments    -0.84
seys      -0.84
itsu      -0.83
gers      -0.82
ctive     -0.82
erald     -0.81
bly       -0.81
Positive Logits
Allan        0.95
Giles        0.86
Blackwell    0.83
Cullen       0.82
Rebell       0.80
Lowell       0.77
Baird        0.76
Burgess      0.76
Holden       0.73
Vaugh        0.73
Activations Density: 0.017%