Explanations
references to political ideologies and their implications in society
Negative Logits
į¼
-0.13
/API
-0.13
segue
-0.13
aste
-0.12
渡
-0.12
ABCDEFG
-0.12
ŃIJ
-0.12
omal
-0.12
İĺìĿ´
-0.12
еле
-0.12
Positive Logits
in
0.51
在
0.34
în
0.34
ใน
0.31
في
0.30
在
0.29
InThe
0.26
در
0.24
In
0.24
فى
0.23
Activations Density 0.144%