INDEX
Explanations
references to weight loss and healthy eating
Negative Logits
↵       -0.20
;\↵     -0.17
–       -0.17
–       -0.17
ï¼Ĩ     -0.17
:↵      -0.15
...↵    -0.15
;↵      -0.15
...↵    -0.15
–↵      -0.15
POSITIVE LOGITS
Dat       0.42
Dating    0.34
Sites     0.33
Site      0.29
D         0.28
Dat       0.28
Dx        0.28
datings   0.26
Dating    0.26
sites     0.25
Activations Density 0.002%