Explanations
Assertions or claims about societal issues and controversies
even with comparatives
Negative Logits
verwijzen      -0.45
Aholisi        -0.42
andererseits   -0.40
հղումներ       -0.39
åter           -0.38
dunque         -0.38
beforeAll      -0.37
فريبيس         -0.36
deka           -0.36
тельстве       -0.35
Positive Logits
sogar          0.49
even           0.49
even           0.44
hatta          0.44
még            0.44
даже           0.42
lagi           0.41
additional     0.40
additional     0.40
EDEFAULT       0.39
Activations Density: 0.083%