INDEX
Explanations
instances of the letter 'c' in various contexts
Negative Logits
Token   Logit
l       -0.20
ri      -0.19
ono     -0.19
anza    -0.18
ish     -0.17
ysa     -0.16
lh      -0.16
onn     -0.16
lé      -0.16
alto    -0.15
Positive Logits
Token   Logit
̧       0.17
ho      0.16
Wade    0.15
vụ      0.15
ital    0.15
ита     0.15
iphers  0.14
айд     0.14
ensure  0.14
pron    0.14
Activations Density: 0.158%