INDEX
Explanations
words related to health and healthcare topics
Negative Logits
amy      -0.19
health   -0.18
здоров   -0.18
AMY      -0.17
ellen    -0.17
health   -0.17
hee      -0.16
ischen   -0.16
胆       -0.15
ella     -0.15
Positive Logits
care     0.21
Care     0.19
abit     0.16
care     0.16
iest     0.16
enser    0.16
ymi      0.15
Humph    0.15
rica     0.15
reform   0.15
Activations Density 0.023%