INDEX
Explanations
terms related to eating disorders and their impact on individuals
Negative Logits
Token    Logit
nock     -0.16
Ľå»º     -0.15
_lazy    -0.14
ñana     -0.14
ught     -0.14
Ark      -0.14
iert     -0.14
uten     -0.14
nj       -0.14
lius     -0.14
Positive Logits
Token    Logit
bul      0.18
orex     0.17
åħ¸      0.15
Eating   0.15
demon    0.15
alore    0.15
obs      0.15
çĺ       0.15
eating   0.14
Body     0.14
Activations Density: 0.011%