INDEX
Explanations
words related to medical organizations and healthcare activities
Negative Logits
(            -0.57
publiques    -0.57
까지         -0.51
as           -0.49
montón       -0.48
in           -0.47
vš           -0.47
\&           -0.45
-            -0.45
/            -0.45
Positive Logits
the          1.30
our          1.07
another      1.01
their        0.96
its          0.93
his          0.92
her          0.80
an           0.80
both         0.80
one          0.78
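
The two lists above read as the ten lowest and ten highest per-token logit values for this feature. The page does not say how those values are produced, so the following is only a minimal sketch of the ranking step. It assumes a mapping from token string to logit value is already available. The function name `top_logits` and the parameter `k` are hypothetical, not part of any dashboard API.

```python
from typing import Dict, List, Tuple


def top_logits(
    logit_by_token: Dict[str, float], k: int = 10
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (most negative, most positive) token/logit pairs.

    Hypothetical helper: it only ranks values that are assumed to be given.
    """
    # Sort ascending by logit value, so the most negative entries come first.
    ranked = sorted(logit_by_token.items(), key=lambda kv: kv[1])
    negative = ranked[:k]
    # Reverse the ranking so the most positive entries come first.
    positive = ranked[::-1][:k]
    return negative, positive


# A few values copied from the lists above, used purely as sample input.
sample = {"the": 1.30, "our": 1.07, "(": -0.57, "as": -0.49}
negative, positive = top_logits(sample, k=2)
```

With this sample input, `negative` starts with `(` and `positive` starts with `the`, matching the order shown in the lists.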
Activations Density 0.957%
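
The page does not define the density figure above. One common reading is the fraction of sampled tokens on which the feature has a nonzero activation. The sketch below computes that quantity under this assumption. The function name `activation_density` and the `threshold` parameter are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires; the source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    # Count tokens on which the feature is active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy input: the feature fires on one token out of four.
density = activation_density([0.0, 0.0, 1.2, 0.0])
```

Under this reading, a density of 0.957% would mean the feature fires on roughly one token in a hundred.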