Explanations
terms related to restrictions and permissions
Negative Logits
myſelf      -1.08
itſelf      -1.01
purpoſe     -1.00
Jefus       -0.98
himſelf     -0.98
faſt        -0.98
Monfieur    -0.97
ſche        -0.96
pleaſure    -0.95
Theſe       -0.93
Positive Logits
P           0.62
(           0.58
S           0.55
M           0.52
.           0.50
for         0.50
W           0.49
U           0.49
B           0.48
L           0.48
Activations Density: 1.002%