INDEX
Explanations
concepts related to social justice and racial issues
Negative Logits
picioare    -0.63
pagină      -0.60
păr         -0.60
femei       -0.58
căr         -0.57
enfans      -0.57
gră         -0.55
dezelve     -0.55
băr         -0.55
—           -0.54
Positive Logits
threw       0.90
thru        0.76
form        0.75
trough      0.70
buy         0.66
whit        0.65
Thru        0.65
weather     0.64
acc         0.63
thro        0.63
Activations Density: 0.816%