INDEX
Explanations
references to locations such as cities
mentions of the state of New York and its abbreviations
Negative Logits
Token       Logit
backer      -0.89
tenance     -0.70
olate       -0.69
iasis       -0.68
framework   -0.66
jamin       -0.65
balls       -0.65
ial         -0.64
icio        -0.63
oise        -0.63
Positive Logits
Token       Logit
RB          1.16
SE          1.07
HC          0.92
NY          0.91
CHR         0.86
WS          0.84
CHA         0.84
BI          0.83
RR          0.82
GOP         0.82
Activations Density: 0.017%