Explanations
high-frequency function words and pronouns
Negative Logits
altung    -0.15
quina     -0.15
iol       -0.15
quier     -0.15
eskort    -0.15
Keys      -0.15
iov       -0.14
ΕΠ        -0.14
ije       -0.14
loud      -0.14
Positive Logits
Nicholson  0.18
acter      0.15
ynam       0.15
pliers     0.15
ekil       0.15
DL         0.15
предел     0.15
Heller     0.14
æģ         0.14
umba       0.14
Activations Density: 0.006%