Explanations
possessive forms and indicators of ownership
Negative Logits
619       -0.21
issen     -0.15
uilder    -0.14
Wage      -0.14
ETHER     -0.14
712       -0.13
onne      -0.13
atoria    -0.13
célib     -0.13
poons     -0.13
Positive Logits
ability   0.19
efforts   0.17
role      0.16
attempt   0.16
attempts  0.15
гу        0.15
use       0.15
n         0.15
способ    0.15
usc       0.15
Activations Density 0.205%