Explanations
mentions of Russia and related entities
Negative Logits
ãģ²       -0.16
ifr       -0.16
bos       -0.15
iswa      -0.15
asco      -0.15
orns      -0.14
TER       -0.14
ibase     -0.14
ivor      -0.14
-long     -0.13
Positive Logits
Federation  0.29
Roulette    0.23
ìĭľìķĦ      0.20
ç½Ĺæĸ¯      0.20
roulette    0.18
Фед         0.18
Dmit        0.17
Fed         0.17
Feder       0.16
Fed         0.16
Activations Density 0.023%