INDEX
Explanations
marginalized, vulnerable, or disadvantaged people
Negative Logits
слегка  0.45
流畅  0.44
颢  0.41
सर्ट  0.41
collim  0.39
bothers  0.39
calibration  0.39
noticeable  0.38
약간  0.38
মোটামুটি  0.38
Positive Logits
poverty  1.02
marginalized  0.95
disenfranch  0.89
impoverished  0.88
marginalised  0.84
vulner  0.82
vulnerable  0.82
disadvantaged  0.82
pobreza  0.82
Poverty  0.81
Activations Density: 0.033%