Explanations
mentions of news and updates related to events and items
Head Attr Weights
0:0.05
1:0.04
2:0.33
3:0.06
4:0.15
5:0.04
6:0.03
7:0.03
8:0.04
9:0.10
10:0.05
11:0.02
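A minimal sketch of how the head attribution weights above could be read programmatically, assuming each line follows the `head:weight` layout shown in the listing. The helper name `parse_head_weights` is hypothetical and not part of any tool; the values are copied verbatim from the listing.

```python
# Head attribution weights copied from the listing above, one "head:weight" per line.
RAW = """0:0.05
1:0.04
2:0.33
3:0.06
4:0.15
5:0.04
6:0.03
7:0.03
8:0.04
9:0.10
10:0.05
11:0.02"""


def parse_head_weights(raw: str) -> dict[int, float]:
    """Parse 'head:weight' lines into a {head_index: weight} mapping (hypothetical helper)."""
    weights: dict[int, float] = {}
    for line in raw.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW)

# Rank heads from largest to smallest attribution weight.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]
total = sum(weights.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

On these values, head 2 carries the largest weight (0.33), followed by heads 4 and 9. The listed weights sum to 0.94 rather than 1.00, which may simply reflect rounding to two decimals in the display.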
Negative Logits
ush           -1.29
commissions   -1.23
yip           -1.22
ussia         -1.17
weed          -1.16
vertisements  -1.15
flix          -1.14
olls          -1.13
vet           -1.12
fle           -1.12
Positive Logits
vironment     1.48
TOP           1.47
WARE          1.46
DIS           1.42
ource         1.40
pta           1.36
enment        1.31
itia          1.30
ciating       1.29
MENTS         1.28
Activations Density 0.003%