Explanations
depression and mental health conditions
Negative Logits
un        1.03
as        1.02
ır        0.97
т         0.93
{         0.89
'         0.87
उच्चतम    0.87
ي         0.84
ុល        0.84
د         0.84
Positive Logits
ти            0.83
chen          0.74
depression    0.70
ה             0.70
ci            0.68
siniz         0.68
Depression    0.68
⁵             0.67
🍱            0.67
ות            0.66
Activations Density: 0.008%