Explanations
mathematical symbols and notations
Negative Logits
y    -1.81
e    -1.30
o    -1.27
i    -1.26
t    -1.25
n    -1.25
d    -1.20
u    -1.17
l    -1.16
s    -1.13
Positive Logits
myſelf        1.89
themſelves    1.80
pleaſure      1.73
purpoſe       1.72
itſelf        1.72
Efq           1.67
ſelves        1.66
reaſon        1.65
himſelf       1.65
Anſ           1.65
Activations Density: 1.170%