INDEX
Explanations
references to locations, particularly in relation to people and events
Head Attr Weights
0:0.03
1:0.01
2:0.05
3:0.06
4:0.14
5:0.12
6:0.04
7:0.09
8:0.11
9:0.05
10:0.19
11:0.07
Negative Logits
20439      -2.59
acter      -2.04
rencies    -1.84
irit       -1.80
CLAIM      -1.80
gey        -1.78
arget      -1.77
ailed      -1.75
xon        -1.73
agher      -1.70
Positive Logits
Cathedral  1.92
town       1.84
vacant     1.79
walls      1.78
ropolis    1.78
atown      1.76
Chinatown  1.73
sew        1.72
wide       1.72
nicer      1.71
Activations Density: 0.024%