Explanations
sections or markers that denote changes in context or structure in the text
Negative Logits
―――――       -1.04
iſt         -0.93
Rhestr      -0.92
étoient     -0.91
faſt        -0.87
avoient     -0.86
(blank)     -0.85
Monfieur    -0.84
pleaſure    -0.83
ſelf        -0.81
Positive Logits
(blank)     1.39
sprozess    0.77
(blank)     0.77
\{\\        0.75
sproz       0.75
"           0.68
K           0.67
(blank)     0.67
(blank)     0.65
C           0.62
Activations Density 0.040%