INDEX
Explanations
names of individuals
proper nouns, particularly names of individuals
Negative Logits
roach     -0.86
aband     -0.84
istani    -0.82
asta      -0.82
oodle     -0.81
Asia      -0.79
mith      -0.79
hiba      -0.75
adesh     -0.75
inki      -0.75
Positive Logits
dinand     1.04
Ernst      0.78
Griffith   0.74
Cly        0.73
Claude     0.70
Ferdinand  0.70
Fleming    0.68
Cyr        0.67
Maurice    0.67
Petr       0.66
Activations Density: 0.013%