INDEX
Explanations
locations in a city
entities related to governance and authority figures
Head Attr Weights
0:0.10
1:0.04
2:0.15
3:0.13
4:0.04
5:0.17
6:0.05
7:0.04
8:0.07
9:0.09
10:0.06
11:0.03
Negative Logits
Flavoring     -1.24
redundancy    -1.20
pmwiki        -1.20
Spoiler       -1.15
Redditor      -1.15
iversal       -1.14
Tube          -1.11
STEM          -1.10
monary        -1.10
Downloadha    -1.08
Positive Logits
anc           1.33
ado           1.28
rouse         1.27
idd           1.26
ab            1.18
endi          1.18
ét            1.15
cent          1.15
otte          1.15
lements       1.14
Activations Density 0.010%