INDEX
Explanations
terms related to different types of conditions or diseases
Negative Logits
ed      -0.25
es      -0.22
al      -0.22
ita     -0.21
itel    -0.19
ti      -0.19
edn     -0.18
it      -0.18
ey      -0.18
ky      -0.18
Positive Logits
omial   0.24
cola    0.24
abox    0.23
coln    0.21
nesota  0.21
ews     0.21
fty     0.20
ners    0.20
shaw    0.20
arily   0.20
Activations Density 0.134%