INDEX
Explanations
references to large-scale objects or activities
references to large-scale systems or phenomena
Negative Logits
Token         Logit
idges         -0.87
cair          -0.80
swick         -0.74
rium          -0.73
edin          -0.73
oother        -0.72
phis          -0.69
ipedia        -0.69
ournal        -0.69
Parsons       -0.69
Positive Logits
Token         Logit
fabrication   0.84
deployment    0.80
production    0.80
fulfillment   0.79
enter         0.76
ecological    0.74
scale         0.74
accident      0.74
adoption      0.73
undertaking   0.72
Activations Density: 0.026%