Explanations
possessive pronouns indicating ownership or association
Negative Logits
mun          -0.22
подс         -0.16
remium       -0.16
stup         -0.15
Fever        -0.15
paralleled   -0.15
.Interop     -0.14
quelle       -0.14
brane        -0.14
kla          -0.14
Positive Logits
share        0.28
sights       0.24
hands        0.23
bearings     0.22
act          0.22
fingers      0.22
own          0.21
fill         0.21
Bearings     0.20
ducks        0.20
Activations Density 0.239%