INDEX
Explanations
specific phrases indicating conditions or experiences related to evaluating or critiquing items or events
Negative Logits
tweeted    -0.39
đương      -0.37
utr        -0.35
pierw      -0.35
ancier     -0.34
bunt       -0.34
чают       -0.33
تقاوى      -0.33
annia      -0.32
cited      -0.32
Positive Logits
otherwise    2.15
Otherwise    2.07
Otherwise    2.07
otherwise    1.95
OTHERWISE    1.88
ellers       1.40
sonst        1.37
jinak        1.32
altrimenti   1.29
Else         1.28
Activations Density 0.327%