Explanations
references to significant socio-political events or government-related activities
Negative Logits
Token     Logit
owell     -0.07
ulan      -0.07
ibel      -0.06
Nicar     -0.06
atz       -0.06
Formal    -0.06
rove      -0.06
yan       -0.06
ãģ¯ãģļ    -0.06
åѦä¼ļ    -0.06
Positive Logits
Token         Logit
Washington    0.12
London        0.11
Washington    0.11
London        0.10
New           0.09
Paris         0.08
Moscow        0.08
Washing       0.08
Los           0.08
Tokyo         0.08
Activations Density: 0.005%
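
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density` and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 5 of which fire.
sample = [0.0] * 99_995 + [1.3, 0.4, 2.1, 0.9, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.005% would correspond to the feature firing on roughly 5 of every 100,000 tokens, which fits a feature tied to a narrow set of tokens such as city names.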