Explanations
mentions of health insurance and healthcare in general
references to health-related topics or issues
Negative Logits
Token       Logit
xual        -0.78
Helpful     -0.71
Darkness    -0.69
arget       -0.68
Duo         -0.68
Reloaded    -0.67
Wilde       -0.66
Gentleman   -0.64
Phant       -0.64
noses       -0.63
Positive Logits
Token       Logit
care        1.19
amacare     1.13
care        1.09
Care        0.99
Care        0.97
iest        0.95
insurance   0.95
health      0.90
aceutical   0.90
ily         0.87
Activation density: 0.029%