Explanations
proper nouns related to people's names
the repeated occurrence of specific patterns or fragments in names and terms related to individuals or entities
Negative Logits
Token       Logit
footed      -0.64
dracon      -0.56
nces        -0.53
Reviewer    -0.53
magnitude   -0.52
ctors       -0.52
unker       -0.52
pic         -0.51
urized      -0.51
thumb       -0.50
Positive Logits
Token       Logit
opa         0.81
chal        0.78
uana        0.73
ela         0.71
quin        0.67
idate       0.65
adr         0.64
Wynne       0.63
atis        0.62
wine        0.61
Activations Density 0.111%