INDEX
Explanations
punctuation marks at the end of sentences, questions, or exclamations
punctuation
Negative Logits
des      -0.52
de       -0.51
TAWA     -0.49
.        -0.48
p        -0.45
su       -0.45
k        -0.45
she      -0.45
sol      -0.43
'        -0.43
Positive Logits
Monfieur   0.92
Diſ        0.90
itſelf     0.90
pleaſure   0.88
myſelf     0.87
saraba     0.86
purpoſe    0.83
Jefus      0.81
―――――      0.80
raiſ       0.80
Activations Density: 1.123%