INDEX
Explanations
expressions of opposition or resistance to proposals and actions
Negative Logits
İÅŀ	-0.15
oader	-0.15
onth	-0.15
charset	-0.15
é¦Ļ	-0.15
olec	-0.15
ehler	-0.14
kip	-0.14
çek	-0.14
nila	-0.14
Positive Logits
ors	0.16
iani	0.15
mers	0.15
rin	0.15
Attempts	0.14
arsed	0.14
زÛĮÙĨÙĩ	0.14
attempts	0.14
ive	0.14
515	0.13
Activations Density 0.117%