Explanations
references to opposition or dissent in political contexts
Negative Logits
igans         -0.16
ulner         -0.15
pent          -0.15
.Constant     -0.15
Rit           -0.15
sublicense    -0.14
ingle         -0.14
\<            -0.14
gere          -0.14
лÑĸв          -0.14
Positive Logits
andalone      0.15
جاد           0.14
éŃļ           0.14
eyse          0.14
EventManager  0.14
uger          0.13
"]."          0.13
ãĥ³ãĥģ        0.13
jadx          0.13
ãİ            0.13
Activations Density: 0.012%