Explanations
various instances of the letter 's' in differing contexts
Negative Logits
Token       Logit
myſelf      -1.62
pleaſure    -1.56
purpoſe     -1.55
itſelf      -1.54
―――――       -1.51
raiſ        -1.45
fubject     -1.44
NUMX        -1.43
Anſ         -1.41
faſt        -1.39
Positive Logits
Token       Logit
s           1.34
his         1.27
His         1.03
(blank)     1.02
my          0.98
his         0.98
His         0.93
Her         0.92
(           0.92
I           0.91
Activations Density: 0.206%