INDEX
Explanations
key terms related to societal and political issues
Negative Logits
antine          -0.16
yster           -0.15
enger           -0.14
newVal          -0.14
leitung         -0.14
ogle            -0.14
iso             -0.13
ëĮĢë¡ľ          -0.13
žel             -0.13
igidBody        -0.13
Positive Logits
çļĦæĺ¯          0.36
are             0.31
_are            0.21
adalah          0.19
ÙĩستÙĨد         0.19
ãģ®ãģ¯          0.18
were            0.18
are             0.18
ÑıвлÑıÑİÑĤÑģÑı    0.18
ï¼īãģ¯          0.18
Activations Density 0.198%
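
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a percentage could be computed. It assumes density means the share of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires. The name and signature are illustrative only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 500.
sample = [0.0] * 499 + [1.7]
print(f"{activation_density(sample):.3f}%")  # prints "0.200%"
```

Under that assumption, a density of 0.198% would correspond to the feature firing on roughly 2 of every 1,000 sampled tokens.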