Explanations
verbs indicating progress or advancement
Negative Logits
oppel      -0.17
OND        -0.15
umn        -0.15
backward   -0.14
ольно      -0.14
onal       -0.14
fy         -0.14
iden       -0.14
rador      -0.14
bol        -0.13
Positive Logits
bracht     0.15
ularity    0.15
Mahon      0.14
-away      0.14
غ          0.14
ovan       0.14
able       0.14
стр        0.13
orch       0.13
illas      0.13
Activations Density 0.056%