INDEX
Explanations
words related to political parties
repeated mentions of specific entities and names relevant to a narrative or context
Negative Logits
Token       Logit
ulent       -0.71
overfl      -0.68
rawling     -0.68
perme       -0.66
wave        -0.66
Indies      -0.64
Petraeus    -0.64
Lions       -0.63
owed        -0.63
Leilan      -0.63
Positive Logits
Token       Logit
aday        0.92
EMENT       0.90
etime       0.90
CRIP        0.88
cius        0.81
seys        0.81
atell       0.81
ignt        0.80
eties       0.79
pring       0.79
Activations Density: 0.014%