Explanations
terms associated with health or medical conditions and biological entities
Negative Logits
ので       -0.58
ا         -0.52
a         -0.50
able      -0.50
in        -0.45
ability   -0.42
ao        -0.40
amaz      -0.40
an        -0.40
are       -0.40
Positive Logits
med       0.64
mable     0.63
mers      0.62
mmm       0.58
ming      0.58
ization   0.55
soever    0.53
MING      0.53
mler      0.52
iliar     0.52
Activations Density: 1.264%