INDEX
Explanations
references to the character "Holmes" with varying degrees of activation intensity
references to characters or elements from the Sherlock Holmes stories
Negative Logits
ital       -0.86
nia        -0.79
icas       -0.78
ITIES      -0.77
illian     -0.74
arian      -0.74
acious     -0.71
itative    -0.71
imens      -0.70
ities      -0.70
Positive Logits
Flavoring  1.01
mble       0.94
ĸļ         0.89
ecake      0.80
ï¸         0.77
lling      0.68
´          0.67
xual       0.66
ĪĴ         0.66
llers      0.64
Activations Density 0.058%