INDEX
Explanations
infinitive verbs, particularly those indicating necessity or obligation
Negative Logits
uard        -0.17
usta        -0.17
rift        -0.16
uale        -0.15
ÛĮÙģ        -0.15
ught        -0.14
CONTR       -0.14
erson       -0.14
Ľå»º        -0.14
chedulers   -0.14
Positive Logits
bite        0.16
815         0.16
ums         0.14
akes        0.14
utsch       0.14
lem         0.14
bitten      0.14
och         0.14
ro          0.13
agy         0.13
Activations Density: 0.029%