Explanations
locations mentioned in the context of protests or significant public events
Negative Logits
Albania       -0.15
dikke         -0.15
Croatian      -0.15
ÑģÑĤи         -0.14
Nicar         -0.14
Yugoslavia    -0.14
Slovak        -0.14
ãĥ¥ãĥ¼        -0.14
odia          -0.14
olia          -0.14
Positive Logits
Paris         0.44
London        0.41
Paris         0.39
London        0.38
paris         0.35
Rome          0.35
london        0.32
Berlin        0.30
Tokyo         0.29
Milan         0.29
Activations Density 0.315%
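
Two entries in the negative logits list, ÑģÑĤи and ãĥ¥ãĥ¼, look like byte-level BPE vocabulary strings rather than readable text. The following is a minimal sketch of how such strings could be decoded, assuming the tokenizer uses the GPT-2 style byte-to-character table. The source does not name the tokenizer, so that is an assumption, and `decode_token` is a hypothetical helper, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to a printable character.

    Assumption: the dashboard's tokenizer uses this scheme; the source does not say so.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: printable stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level vocabulary string into readable text."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character is not part of the table (already readable); pass it through.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¥ãĥ¼", "ÑģÑĤи", "Paris"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãĥ¥ãĥ¼ decodes to the Japanese katakana fragment ュー and ÑģÑĤи to the Cyrillic fragment сти, while plain ASCII tokens such as Paris are unchanged. If the tokenizer uses a different scheme, the decoded strings would not be meaningful.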