INDEX
Explanations
phrases related to human-made or artificial entities
terms related to human-made processes and products
Negative Logits
alos        -0.98
aucus       -0.91
esville     -0.85
earchers    -0.82
achu        -0.81
eeks        -0.80
irlf        -0.80
stadt       -0.80
osponsors   -0.77
ebus        -0.76
Positive Logits
interventions   0.84
disasters       0.81
disaster        0.73
goods           0.69
pseudo          0.69
versions        0.68
excuses         0.68
substances      0.68
garbage         0.67
artificial      0.67
Activations Density 0.089%