INDEX
Explanations
references to individuals or entities involved in reporting or academia
Negative Logits
apus        -0.16
argent      -0.16
yna         -0.15
ãĥ©ãĤ¹      -0.15
avaÅŁ       -0.15
arius       -0.15
buá»ķi      -0.14
ÙĨاء        -0.14
.volley     -0.14
æ¸Ī         -0.14
Positive Logits
Democracy   0.18
utt         0.17
orts        0.15
AMY         0.15
avn         0.14
democracy   0.14
won         0.14
.openg      0.14
BarItem     0.14
347         0.14
Activations Density: 0.003%