INDEX
Explanations
words and suffixes related to concepts of knowledge, awareness, and existence
Negative Logits
s       -0.19
bers    -0.19
leta    -0.18
head    -0.18
ses     -0.17
ÏĤ      -0.17
ŀ       -0.16
heads   -0.16
ctor    -0.16
Ùĩ      -0.16
Positive Logits
emente  0.32
iation  0.29
iated   0.28
ials    0.23
itious  0.20
ech     0.20
zia     0.19
ennial  0.19
alist   0.19
ia      0.19
Activations Density: 0.235%