INDEX
Explanations
mentions of politicians and governments, particularly prime ministers and political parties
mentions of specific times written with "PM"
Negative Logits
keyes       -0.84
¯           -0.77
ãĥ³ãĤ¸      -0.75
Flavoring   -0.74
keye        -0.74
itals       -0.71
tyard       -0.69
ãĤ¨ãĥ«      -0.67
come        -0.66
Guinness    -0.66
Positive Logits
PM          1.42
essage      1.10
ueller      1.08
PM          0.88
PLIC        0.84
mosqu       0.84
ODE         0.83
PS          0.82
endment     0.82
PD          0.81
Activations Density: 0.003%