INDEX
Explanations
-tion, -ity, and -ing words
Negative Logits
'       1.24
AR      0.79
AN      0.78
I       0.77
IN      0.74
l       0.71
Is      0.70
에서는  0.70
AT      0.68
AG      0.68
Positive Logits
and     1.55
be      1.41
or      1.30
are     1.28
σ       1.22
and     1.21
were    1.14
ud      1.12
us      1.11
can     1.11
Activations Density: 2.036%