Explanations
phrases related to mental health and medical conditions such as depression and Alzheimer's disease
Negative Logits
Sources     -0.77
aque        -0.75
ateur       -0.72
Reply       -0.71
leigh       -0.70
Commerce    -0.68
lay         -0.67
quer        -0.66
skirts      -0.65
clip        -0.65
Positive Logits
relapse     1.00
depression  0.94
depressive  0.91
diagnosis   0.85
symptoms    0.82
medication  0.79
disorder    0.77
worsen      0.76
mood        0.75
spiral      0.75
Activations Density: 0.033%