INDEX
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.07
5:0.07
6:0.09
7:0.07
8:0.07
9:0.09
10:0.09
11:0.08
Negative Logits
ade:-2.50
itar:-2.35
anca:-2.34
ades:-2.28
quit:-2.26
izons:-2.24
untarily:-2.23
morning:-2.21
ilipp:-2.20
upe:-2.15
Positive Logits
ュ:2.30
HOW:2.21
verb:2.16
Static:2.15
BILITY:2.14
theless:2.12
THESE:2.09
��:2.09
util:2.06
WARE:1.99
Activations Density 0.000%