Explanations
complex terms related to political and social conditions
Negative Logits
lal       -0.18
_mC       -0.17
ebo       -0.16
_mB       -0.16
_mE       -0.15
Chung     -0.15
aclass    -0.14
roje      -0.14
lse       -0.14
ãĥ        -0.14
Positive Logits
uras      0.14
uri       0.14
ãĥ«ãĤ¯    0.14
equally   0.14
Bosch     0.14
bie       0.14
radi      0.13
eries     0.13
leading   0.13
imum      0.13
Activations Density: 0.164%