INDEX
Explanations
names of individuals in news articles
instances of parentheses or textual information enclosed within parentheses
Negative Logits
transact      -0.75
adjustments   -0.72
bump          -0.69
rave          -0.69
conver        -0.69
parity        -0.68
clin          -0.68
altogether    -0.66
procedural    -0.65
deem          -0.65
Positive Logits
pictured      1.64
left          1.41
above         1.35
Photo         1.33
Courtesy      1.30
pron          1.26
shown         1.25
bottom        1.24
center        1.23
right         1.17
Activations Density: 0.056%