INDEX
Explanations
names of individuals
proper nouns or names, particularly those with specific phonetic patterns
Negative Logits
sideways   -0.67
wom        -0.64
cul        -0.63
quarters   -0.62
abouts     -0.62
losers     -0.61
demol      -0.60
wells      -0.60
cred       -0.60
grounds    -0.59
POSITIVE LOGITS
ciating    1.19
ement      0.92
rand       0.79
enment     0.78
ued        0.78
OVA        0.77
uese       0.76
lement     0.74
tion       0.72
alia       0.72
Activations Density: 0.173%