INDEX
Explanations
proper nouns, particularly names and related entities
names of individuals, particularly focusing on the last names starting with the letter "D"
Negative Logits
DragonMagazine        -0.75
jri                   -0.73
commissions           -0.67
querade               -0.66
agall                 -0.65
sher                  -0.64
icip                  -0.64
cess                  -0.64
é¾įå¥ij士 (龍契士)      -0.61
Mehran                -0.61
Positive Logits
ãĥķãĤ© (フォ)                          0.76
ãĤ¶ (ザ)                               0.72
ãĥ¼ãĥĨ (ーテ)                          0.71
ãĤ¿ (タ)                               0.70
ãĥ¡ (メ)                               0.68
ãĤ¨ãĥ« (エル)                          0.67
ãĥ¼ãĥ (ー, then a truncated character)  0.66
ãĥĺ (ヘ)                               0.66
atz                                   0.65
ãĤ« (カ)                               0.65
Activations Density 0.406%
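
Several token strings in the logit lists look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The following is a minimal sketch of how such strings could be turned back into readable text, assuming a GPT-2-style byte-to-character table (the source does not say which tokenizer produced this dashboard). The function names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2-style byte-level BPE (assumed here)."""
    # Bytes that already render as visible characters map to themselves.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in keep}
    # Every remaining byte is assigned the next free code point from U+0100 upward.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Convert a displayed token string back to text.

    Characters outside the table (already-decoded text) are passed through
    as their own UTF-8 bytes; incomplete byte sequences are replaced rather
    than raising an error.
    """
    raw = b"".join(
        bytes([UNICODE_TO_BYTE[ch]]) if ch in UNICODE_TO_BYTE else ch.encode("utf-8")
        for ch in display
    )
    return raw.decode("utf-8", errors="replace")
```

Under that assumption, most of the positive-logit tokens decode to katakana fragments, and one of them ends partway through a character, which is expected when a token boundary falls inside a multi-byte sequence.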