INDEX
Explanations
LaTeX formatting elements, particularly those used for mathematical expressions and equations
Negative Logits
Token    Logit
y        -1.29
l        -0.95
c        -0.94
z        -0.93
i        -0.93
x        -0.92
n        -0.91
h        -0.91
-        -0.88
p        -0.87
Positive Logits
Token         Logit
themſelves    1.70
myſelf        1.66
Efq           1.66
pleaſure      1.60
})*/          1.55
purpoſe       1.49
Anſ           1.48
Theſe         1.48
ſelves        1.47
uſed          1.46
Activations Density 0.932%