INDEX
Explanations
References to health and wellness, particularly "healthy" attributes and lifestyles
Negative Logits
kasarigan        -0.83
GenerationType   -0.75
GGLE             -0.75
hefyd            -0.74
חיצוניים         -0.73
섰               -0.70
antula           -0.68
calendriers      -0.68
виправивши       -0.67
̀u               -0.67
Positive Logits
healthy          1.91
Healthy          1.73
Healthy          1.72
healthy          1.71
healthier        1.40
healthiest       1.38
unhealthy        1.36
saludable        1.35
saudável         1.25
olesome          1.21
Activations Density 0.082%