Explanations
mentions of organizations or groups
Negative Logits
Token     Logit
but       -0.22
but       -0.20
thereby   -0.18
而        -0.18
çünkü     -0.17
olut      -0.16
ャ        -0.16
но        -0.16
takže     -0.16
否        -0.16
Positive Logits
Token      Logit
like       0.30
unlike     0.29
along      0.25
along      0.22
contrary   0.21
however    0.20
則         0.20
despite    0.20
meanwhile  0.20
together   0.20
Activations Density: 0.214%