Explanations
occurrences of the letter 'o'
Negative Logits
Theſe       -1.03
myſelf      -0.90
་་          -0.87
Beſ         -0.86
himſelf     -0.86
Monfieur    -0.83
itſelf      -0.82
faſt        -0.82
?";         -0.81
neſs        -0.80
Positive Logits
O           1.69
o           1.63
O           1.54
oocytes     1.25
o           1.13
nO          1.05
afone       1.01
cO          1.00
о           0.96
o           0.94
Activations Density 0.068%