INDEX
Explanations
names of locations and events
Negative Logits
thood              -0.85
natureconservancy  -0.73
ertodd             -0.71
obin               -0.67
xual               -0.66
versive            -0.66
osexual            -0.66
rez                -0.65
reated             -0.65
onym               -0.65
Positive Logits
Calif     1.23
Colo      1.12
TX        1.02
Fla       1.00
TN        0.99
Md        0.98
Ala       0.98
Tenn      0.97
Ont       0.96
Illinois  0.94
Activations Density: 0.574%