Head Attr Weights
0:0.08
1:0.07
2:0.10
3:0.07
4:0.09
5:0.08
6:0.08
7:0.08
8:0.08
9:0.07
10:0.07
11:0.07
Negative Logits
revolt:-2.12
Mean:-2.11
mockery:-2.04
1936:-1.98
Scha:-1.93
Sons:-1.90
Herm:-1.87
doomed:-1.87
spitting:-1.83
1896:-1.82
Positive Logits
ら:2.43
taboola:2.30
microsoft:2.20
さ:2.18
き:2.16
wayne:2.13
eric:2.09
INESS:2.09
76561:2.05
oxin:2.03
Activations Density 0.000%