NEGATIVE LOGITS
Token          Logit
is             -2.00
has            -1.01
are            -0.84
was            -0.84
can            -0.77
Is             -0.75
will           -0.73
is             -0.72
isn            -0.72
may            -0.69
POSITIVE LOGITS
Token          Logit
myſelf         1.16
ſaid           1.11
ſmall          1.10
uſed           1.09
raiſ           1.08
leſs           1.08
Personendaten  1.08
purpoſe        1.06
Personensuche  1.05
poffe          1.05
Activations Density 0.802%