Explanations
possessive forms and expressions of ownership
Negative Logits
Token     Logit
ën        -0.21
ster      -0.17
’na       -0.17
’d        -0.15
ular      -0.15
liness    -0.15
’s        -0.15
ish       -0.15
sher      -0.15
em        -0.14
Positive Logits
Token     Logit
Choice    0.22
aurus     0.21
-eye      0.20
sah       0.18
Eve       0.18
s         0.18
ÂĢÂĻ      0.17
chaft     0.17
'-        0.17
Choice    0.17
Activations Density 0.116%