INDEX
Explanations
terms related to social and political issues, such as racism, slavery, and controversial government policies
Negative Logits
regrets     -0.67
trust       -0.66
CONTR       -0.65
Donation    -0.64
depends     -0.63
deficit     -0.59
sums        -0.59
IPM         -0.58
Decision    -0.58
fid         -0.57
Positive Logits
abound       1.41
prolifer     1.24
rampant      1.09
everywhere   1.04
popping      1.02
emerge       0.99
prevail      0.99
thrive       0.96
bloom        0.96
circulate    0.95
Activations Density 0.783%
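
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming it means the fraction of sampled token positions on which the feature fires. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" counts a position as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 8 of which are active.
acts = [0.0] * 992 + [0.5, 1.2, 0.3, 2.0, 0.1, 0.7, 0.9, 1.5]
density = activation_density(acts)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.783% would mean the feature is active on roughly 8 of every 1,000 sampled tokens.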