INDEX
Explanations
terms related to health, medical practices, and their impacts on well-being
Negative Logits
hir     -0.15
.She    -0.15
иÑĨ     -0.14
İR      -0.14
igma    -0.13
896     -0.13
åħ¸     -0.13
Them    -0.13
ãĤı     -0.13
ITS     -0.13
Positive Logits
there       0.33
Ù쨥ÙĨ       0.32
we          0.30
it          0.30
they        0.27
thì         0.25
оно         0.24
ìļ°ë¦¬ëĬĶ    0.22
there       0.22
maka        0.21
Activations Density: 0.903%