INDEX
Explanations
mentions of individuals or entities involved in conflicts, controversies, or notable events
Negative Logits
عنه      -0.15
coni     -0.14
inand    -0.14
ibur     -0.14
芸       -0.14
iddet    -0.14
resi     -0.14
reative  -0.14
幹線     -0.14
inne     -0.13
Positive Logits
of       0.55
của      0.37
of       0.34
Of       0.29
.of      0.28
of       0.26
งของ     0.25
_of      0.25
ของ      0.25
ủa       0.23
Activations Density 0.017%