INDEX
Explanations
references to man-made phenomena or items
phrases containing references to human-made phenomena or impacts
Negative Logits
byss                  -0.93
alos                  -0.89
osponsors             -0.85
achu                  -0.82
yip                   -0.82
earchers              -0.81
externalActionCode    -0.81
pload                 -0.81
aucus                 -0.80
asso                  -0.80
Positive Logits
disasters             0.85
disaster              0.79
goodies               0.73
horrors               0.72
pseudo                0.71
artificial            0.71
medi                  0.69
goods                 0.67
interventions         0.66
excuses               0.66
Activations Density 0.103%