Explanations
possessive forms of words, particularly indicating ownership or belonging
Negative Logits
Token     Logit
Finger    -0.16
finger    -0.15
gens      -0.15
HAM       -0.14
chter     -0.14
Overall   -0.14
ffset     -0.14
osta      -0.14
uch       -0.14
ince      -0.13
Positive Logits
Token     Logit
reon      0.17
ãĥ«ãĥķ    0.16
룬        0.15
Reject    0.15
rang      0.14
arth      0.14
Sto       0.14
Druh      0.13
obel      0.13
oble      0.13
Activations Density: 0.037%