Explanations
repeated instances of the letter 'z'
Negative Logits
Token        Logit
purpoſe      -1.17
Monfieur     -1.16
myſelf       -1.16
ſeveral      -1.13
pleaſure     -1.11
Theſe        -1.08
faſt         -1.06
themſelves   -1.05
viſ          -1.02
ſever        -1.02
Positive Logits
Token        Logit
Z            1.73
z            1.60
Z            1.49
z            1.20
S            1.06
C            1.05
K            1.04
l            0.98
D            0.97
w            0.97
Activations Density 0.068%