Explanations
proper nouns, especially names of people
proper nouns, particularly names of people and notable identifiers
Negative Logits
Token    Logit
ength    -0.85
龍       -0.81
psons    -0.80
oled     -0.79
ously    -0.79
oured    -0.78
ize      -0.78
opol     -0.77
saf      -0.76
osal     -0.75
Positive Logits
Token    Logit
Bie      1.34
Tuc      0.88
Dele     0.77
kson     0.77
Wyatt    0.76
braska   0.75
CU       0.75
Gunn     0.74
Akin     0.74
Reeves   0.73
Activations Density: 0.018%