INDEX
Explanations
phrases related to societal issues and controversies
connections to various issues related to social inequality and justice
Negative Logits
Token          Logit
uct            -0.75
umbn           -0.69
escription     -0.67
Optional       -0.66
76561          -0.62
owered         -0.62
Username       -0.60
skin           -0.60
cuff           -0.59
EED            -0.59
Positive Logits
Token          Logit
deserves       1.18
certainly      1.13
consequently   1.10
indeed         1.09
hence          1.06
rightfully     1.06
particularly   1.03
exacerbated    1.01
therein        1.00
underscores    1.00
Activations Density 0.267%
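
The page does not define how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 tokens.
sample = [0.0] * 997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.267% would mean the feature fires on roughly 27 of every 10,000 tokens.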