INDEX
Explanations
the names of individuals associated with sports teams
parentheses and related structures in the text
Negative Logits
following      -0.70
hitters        -0.69
consist        -0.66
normalized     -0.66
shaming        -0.65
demographic    -0.65
broadly        -0.65
ingredients    -0.65
overl          -0.64
circulating    -0.63
Positive Logits
USA            1.04
Atl            1.04
who            0.98
voc            0.97
University     0.97
Assistant      0.97
cox            0.92
Columb         0.91
Ph             0.89
lead           0.89
Activations Density 0.107%