Explanations
linguistic variations and transformations of verbs and their grammatical forms
Negative Logits
stellung   -0.23
saw        -0.17
ate        -0.17
werk       -0.17
gave       -0.17
drew       -0.17
Nose       -0.16
ernal      -0.16
itou       -0.15
took       -0.15
Positive Logits
ichtet     0.27
unden      0.22
etzt       0.21
ählt       0.20
zeichnet   0.20
ellt       0.19
agt        0.18
nom        0.18
issen      0.18
legt       0.18
Activations Density: 0.019%