INDEX
Explanations
mentions of things considered slender or thin
references to the concept of being slim or thin
Negative Logits
citation      -0.78
DCS           -0.74
rique         -0.65
Occupations   -0.64
Mobil         -0.63
Hosp          -0.63
NAT           -0.62
Watching      -0.61
Ù             -0.61
Abuse         -0.60
Positive Logits
ming          1.18
med           1.08
tty           0.85
y             0.81
ples          0.79
waist         0.79
ethy          0.78
ily           0.77
ideshow       0.76
suit          0.75
Activations Density 0.012%