INDEX
Explanations
comparative adjectives that indicate size or degree of something
Negative Logits
Token     Logit
y         -0.23
son       -0.19
screen    -0.17
ert       -0.17
eln       -0.17
ertype    -0.17
sing      -0.17
set       -0.17
ikel      -0.17
sc        -0.16
Positive Logits
Token     Logit
-than     0.57
than      0.44
than      0.43
_than     0.42
THAN      0.34
Than      0.30
Than      0.30
než       0.26
всего     0.25
niż       0.24
Activations Density 0.160%