INDEX
Explanations
references to ownership and responsibility
Head Attr Weights
0:0.05
1:0.02
2:0.07
3:0.15
4:0.25
5:0.04
6:0.04
7:0.05
8:0.06
9:0.06
10:0.05
11:0.07
Negative Logits
advertisement   -1.70
Frie            -1.48
.)              -1.42
Slug            -1.42
glers           -1.42
Luk             -1.40
Canaver         -1.39
phabet          -1.38
depending       -1.34
apiece          -1.34
Positive Logits
intend          1.78
アル            1.43
ught            1.41
gans            1.40
faire           1.36
ever            1.32
igne            1.27
dare            1.26
hesitate        1.25
justice         1.25
Activations Density 0.026%