Explanations
phrases indicating negative sentiment towards authority or systemic issues
Negative Logits
ureau       -0.16
Ø´ÙħاÙĦÛĮ   -0.15
ÙĪØ§Ø±      -0.15
ahead       -0.15
trừ         -0.14
icc         -0.14
icers       -0.14
ĮĢ          -0.14
205         -0.13
iola        -0.13
Positive Logits
mass        0.24
masses      0.23
Massive     0.21
reaching    0.20
mass        0.20
Reach       0.20
reach       0.20
Mass        0.20
reach       0.19
massive     0.19
Activations Density 0.027%