INDEX
Explanations
text related to policy recommendations and government actions regarding discrimination and work conditions
Negative Logits
fame          -0.79
death         -0.76
Strange       -0.76
weird         -0.75
fuck          -0.75
mysteriously  -0.72
Legend        -0.71
unlucky       -0.71
lol           -0.71
pissed        -0.70
Positive Logits
incentiv       1.11
equitable      1.11
interventions  1.10
priorit        1.09
initiatives    1.07
implementing   1.07
reforms        1.05
safeguards     1.03
strengthening  1.03
tackling       1.03
Activations Density 0.749%