Explanations
reported statements and claims made by individuals or organizations
Negative Logits
ourselves    -0.17
According    -0.14
我们的       -0.14
sip          -0.14
according    -0.14
according    -0.14
unseren      -0.13
нами         -0.13
我们         -0.13
.manager     -0.13
Positive Logits
any          0.22
there        0.22
:            0.21
it           0.20
none         0.19
although     0.19
while        0.18
Tuesday      0.18
nobody       0.18
Monday       0.17
Activations Density: 0.188%