INDEX
Explanations
names of political figures and countries
instances of immediate actions or requests within political contexts
Negative Logits
omaly        -0.70
lineage      -0.68
Occ          -0.64
apter        -0.63
entropy      -0.63
Exper        -0.62
manuscript   -0.62
uyomi        -0.59
Wonderful    -0.59
TOR          -0.58
Positive Logits
applauded       1.15
congratulated   1.14
condemned       1.07
criticised      1.05
tweeted         1.04
condol          1.03
praised         1.00
denounced       1.00
reacted         1.00
urged           1.00
Activations Density: 0.652%