INDEX
Explanations
past participles of verbs
Negative Logits
Token       Logit
i           -0.22
asted       -0.17
lek         -0.16
ież         -0.16
s           -0.15
marvin      -0.15
adows       -0.15
incident    -0.15
ologne      -0.15
olvers      -0.15
Positive Logits
Token       Logit
dy          0.26
gy          0.25
ifice       0.24
icts        0.22
d           0.21
die         0.21
ema         0.21
ging        0.21
uction      0.20
dress       0.20
Activations Density: 0.007%