Explanations
articles and possessive pronouns related to individuals
Negative Logits
Token    Logit
cia      -0.16
ema      -0.15
etsk     -0.15
peare    -0.15
anki     -0.15
ạc       -0.14
iator    -0.14
entin    -0.14
دار      -0.14
aker     -0.13
Positive Logits
Token    Logit
issen    0.16
ingle    0.15
likes    0.14
ayet     0.14
ække     0.14
bidden   0.14
803      0.14
-Origin  0.14
stime    0.14
ikler    0.14
Activations Density: 0.012%