Explanations
references to policy changes and legal actions related to social issues
Negative Logits
是一个    -0.20
所有      -0.18
的一个    -0.18
모든      -0.16
sebuah    -0.15
twice     -0.15
vešker    -0.15
another   -0.14
tüm       -0.14
任何      -0.14
Positive Logits
either        0.55
either        0.45
Either        0.43
Either        0.39
various       0.39
varying       0.38
либо          0.38
EITHER        0.37
respectively  0.36
respective    0.34
Activations Density 0.648%