INDEX
Explanations
terms related to identification and classification
Negative Logits
of      -0.48
“       -0.47
than    -0.46
on      -0.46
...     -0.46
at      -0.44
,       -0.43
also    -0.43
.       -0.43
be      -0.42
Positive Logits
itſelf       1.28
myſelf       1.27
pleaſure     1.25
Jefus        1.25
IFIC         1.23
IFICATION    1.20
ific         1.20
ification    1.18
purpoſe      1.18
houſe        1.18
Activations Density 0.447%