INDEX
Explanations
opening phrases or introductory elements in a text
Negative Logits
Token       Logit
Efq         -1.59
myſelf      -1.56
Monfieur    -1.48
pleaſure    -1.43
reaſon      -1.43
purpoſe     -1.43
―――――       -1.41
iſt         -1.40
whoſe       -1.38
Jefus       -1.38
Positive Logits
Token               Logit
s                   0.82
</em>               0.77
e                   0.76
</i>                0.72
o                   0.68
(no visible token)  0.68
i                   0.68
t                   0.67
!                   0.65
)                   0.65
Activations Density 0.107%