INDEX
Explanations
occurrences of the letter 's' in various contexts
Negative Logits
Token    Logit
ocket    -0.27
n        -0.23
p        -0.23
ys       -0.23
ql       -0.23
cript    -0.23
hip      -0.23
ub       -0.23
к        -0.22
c        -0.22
Positive Logits
Token    Logit
ras      0.17
rb       0.17
put      0.17
raman    0.17
osos     0.17
chez     0.16
os       0.16
oso      0.16
odal     0.16
meal     0.15
Activations Density: 0.028%