INDEX
Explanations
references to the city of Cleveland
Negative Logits
eller      -0.17
roj        -0.16
sch        -0.15
yon        -0.15
sin        -0.15
azzi       -0.15
ziej       -0.15
Stanton    -0.15
rych       -0.14
go         -0.14
POSITIVE LOGITS
cle        0.23
avage      0.21
Cle        0.20
Kle        0.18
ighton     0.17
-cut       0.17
mons       0.17
Verg       0.17
ary        0.17
aver       0.17
Activations Density: 0.008%