INDEX
Explanations
words related to human involvement in environmental changes or societal issues
Negative Logits
Token          Logit
osponsors      -0.95
alos           -0.93
byss           -0.87
asso           -0.85
skilled        -0.79
lust           -0.78
earchers       -0.78
yip            -0.78
achus          -0.77
oother         -0.76
Positive Logits
Token          Logit
disaster       0.79
version        0.77
disasters      0.76
pseudo         0.76
goodies        0.74
versions       0.73
garbage        0.70
horrors        0.69
interventions  0.69
goods          0.68
Activations Density: 0.068%