Explanations
proper nouns and specific names related to individuals and places
Negative Logits
Token     Logit
uckle     -0.18
anking    -0.16
tics      -0.15
incare    -0.15
cls       -0.15
uples     -0.15
iscard    -0.15
igger     -0.15
uing      -0.15
pio       -0.15
Positive Logits
Token     Logit
iyat      0.28
awi       0.26
iyah      0.24
arat      0.24
qa        0.24
heed      0.23
leh       0.23
ayah      0.23
noon      0.22
noun      0.22
Activations Density: 0.129%