Explanations
proper nouns, specifically names related to people and characters
Negative Logits
Token    Logit
Shade    -0.16
ixin     -0.15
331      -0.15
リス     -0.14
bris     -0.14
shade    -0.14
яб       -0.14
back     -0.14
illus    -0.13
ngr      -0.13
Positive Logits
Token    Logit
inke     0.17
dü       0.15
ald      0.15
ンク     0.15
asz      0.14
ุง       0.14
σε       0.14
yk       0.14
isci     0.14
atrix    0.13
Activations Density 0.028%