INDEX
Head Attr Weights
0:0.13
1:0.05
2:0.05
3:0.07
4:0.05
5:0.07
6:0.20
7:0.02
8:0.10
9:0.07
10:0.07
11:0.06
Negative Logits
Bj:-1.34
Bust:-1.32
ć:-1.31
Marin:-1.29
cavalry:-1.27
halla:-1.26
assignment:-1.24
pload:-1.24
Boise:-1.19
Thrones:-1.17
Positive Logits
pecially:1.93
Attempts:1.66
Sadly:1.59
However:1.58
Moreover:1.55
USE:1.53
Therefore:1.52
Introdu:1.51
heat:1.51
Features:1.49
Activations Density 0.003%