INDEX
Explanations
coded structures or syntax, focusing on programming language notation or declarations
Negative Logits
Token   Logit
        -1.06
.       -0.90
is      -0.85
,       -0.84
he      -0.82
O       -0.81
in      -0.81
’       -0.81
A       -0.80
(       -0.80
Positive Logits
Token        Logit
myſelf       1.85
purpoſe      1.73
itſelf       1.73
pleaſure     1.71
Monfieur     1.66
poffible     1.66
reaſon       1.64
fubject      1.63
greateſt     1.60
themſelves   1.59
Activations Density 0.042%