INDEX
Explanations
proper nouns, specifically names of cities and locations
Head Attr Weights
0:0.02
1:0.01
2:0.24
3:0.04
4:0.06
5:0.03
6:0.05
7:0.11
8:0.03
9:0.03
10:0.21
11:0.12
Negative Logits
worldly       -1.46
haunt         -1.43
flashbacks    -1.36
orr           -1.33
glimps        -1.33
lasses        -1.33
ーテ          -1.32
classmates    -1.30
translation   -1.29
00200000      -1.28
Positive Logits
Flag          1.72
ossier        1.48
determines    1.46
llor          1.45
unilaterally  1.42
Count         1.42
ependence     1.41
pires         1.40
unity         1.36
sovereign     1.35
Activations Density 0.018%