INDEX
Explanations
phrases related to fairness or justice, especially in the context of treatment experienced by individuals or groups
phrases related to the concept of treatment, particularly in social or medical contexts
Negative Logits
zyme          -0.74
sky           -0.66
enz           -0.60
frying        -0.60
dreaming      -0.60
raper         -0.58
brainstorm    -0.58
Downing       -0.58
Bour          -0.58
scramble      -0.57
Positive Logits
reatment      0.89
payer         0.85
terness       0.83
ifference     0.82
reated        0.79
phia          0.77
Reviewer      0.77
treated       0.76
ttes          0.75
Gender        0.75
Activations Density 0.025%
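As a rough illustration of what a density figure like this could mean, the sketch below assumes that "activations density" is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts strictly positive activations by default.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 4000.
sample_activations = [0.0] * 3999 + [1.7]
density = activation_density(sample_activations)

# Format as a percentage with three decimal places.
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.025% would correspond to the feature firing on roughly one token in every 4000.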