Explanations
variations of certain verbs in different grammatical forms
Negative Logits
Token     Logit
ourt      -0.16
eyle      -0.16
apia      -0.15
eer       -0.15
Pere      -0.14
ร         -0.14
ÙijÙIJ    -0.14
c         -0.13
lse       -0.13
arget     -0.13
Positive Logits
Token     Logit
chen      0.23
zsche     0.18
legg      0.17
лим       0.16
lein      0.15
ög        0.15
liche     0.15
cheng     0.15
addock    0.15
-mf       0.15
Activations Density: 0.074%