Explanations
mentions of Russia and its related entities
Negative Logits
Token        Logit
ERC          -0.17
ifr          -0.15
bos          -0.15
ivor         -0.15
arians       -0.14
-Encoding    -0.14
ponce        -0.14
asley        -0.14
QM           -0.14
Nam          -0.14
Positive Logits
Token        Logit
Federation   0.36
Roulette     0.24
federation   0.23
ìĭľìķĦ       0.21
ç½Ĺæĸ¯       0.20
Feder        0.19
Фед          0.18
bear         0.18
roulette     0.18
Fed          0.18
Activations Density 0.023%
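
Two of the positive-logit tokens (ìĭľìķĦ and ç½Ĺæĸ¯) look garbled, but they appear to be raw byte-level BPE token strings rather than encoding damage. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-unicode mapping, which this page does not confirm; the helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard. Under that assumption, the strings decode to the tails of the Korean and Chinese words for "Russia" (러시아, 俄罗斯), which is consistent with the explanation above.

```python
def bytes_to_unicode():
    """Build the GPT-2 style map from byte values (0-255) to printable characters.

    Assumption: the model behind this dashboard uses this byte-level BPE scheme.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse map: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed byte-level token string back into readable UTF-8 text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens may cover only part of a multi-byte character, so replace
    # undecodable bytes instead of raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ç½Ĺæĸ¯"))      # → 罗斯
print(decode_token("ìĭľìķĦ"))      # → 시아
print(decode_token("Federation"))  # → Federation
```

Plain ASCII tokens pass through unchanged, so the same helper can be applied to every row of the logit lists.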