INDEX
Explanations
words related to societal issues and disparities
references to societal and economic challenges
Negative Logits
Adds         -0.81
ggles        -0.81
tains        -0.80
ebin         -0.80
Has          -0.78
has          -0.74
WATCH        -0.74
Update       -0.73
Looks        -0.73
osponsors    -0.71
Positive Logits
depended      1.05
mattered      1.05
tended        1.02
lacked        0.94
outnumbered   0.91
ranged        0.85
amounted      0.83
flowed        0.83
they          0.81
belonged      0.81
Activations Density: 1.062%