INDEX
Explanations
various forms of punctuation and special characters in the text
Negative Logits
bezeichneter     -1.67
Efq              -1.64
myſelf           -1.60
Personendaten    -1.59
pleaſure         -1.59
faſt             -1.58
ſever            -1.57
ſind             -1.56
ſelf             -1.56
―――――            -1.55
Positive Logits
,                 1.29
.                 1.21
                  1.10
(                 0.97
↵↵                0.96
-                 0.91
/                 0.91
to                0.90
-                 0.88
(                 0.88
Activations Density 6.164%