Explanations
comparative language and phrases denoting relationships between quantities or qualities
Negative Logits
Token            Logit
Мексичка         -1.30
propOrder        -1.07
OGND             -1.04
SequentialGroup  -1.01
يتيمه            -0.95
ftagPool         -0.92
pinulongan       -0.90
abestanden       -0.89
виправивши       -0.88
ModelExpression  -0.87
Positive Logits
Token  Logit
.      0.63
se     0.54
in     0.53
       0.51
.      0.50
di     0.49
:      0.47
…      0.47
"      0.47
↵↵     0.47
Activations Density 0.587%