INDEX
Explanations
names of individuals related to politics and news headlines
mentions of specific individuals, particularly in relation to statements or reports
Negative Logits
deaf          -0.70
orian         -0.67
Lex           -0.67
Mart          -0.66
Occupations   -0.66
essors        -0.66
teenth        -0.65
Wars          -0.65
FUL           -0.63
laus          -0.63
Positive Logits
Weiner        1.12
Abedin        0.96
aukee         0.87
ufact         0.74
arty          0.74
needles       0.73
ullivan       0.72
ウス          0.71
agog          0.70
combe         0.69
Activations Density 0.006%