INDEX
Explanations
variations of the word "to."
Negative Logits
ſtate       -1.04
purpoſe     -1.01
auroit      -0.98
Geplaatst   -0.94
houſe       -0.92
feroit      -0.91
Inscrivez   -0.91
pleaſure    -0.89
avoient     -0.88
myſelf      -0.85
Positive Logits
“           0.73
"])         0.73
"):         0.71
(blank)     0.71
saites      0.70
ัพท์        0.65
be          0.64
%")         0.62
an          0.62
major       0.62
Activations Density 0.283%