INDEX
Explanations
references to historical events and their context
Head Attr Weights
0:0.13
1:0.44
2:0.02
3:0.02
4:0.02
5:0.15
6:0.01
7:0.02
8:0.04
9:0.06
10:0.02
11:0.02
Negative Logits
folder:-1.94
thinkable:-1.94
dropping:-1.93
ovember:-1.82
edged:-1.81
advertising:-1.79
munition:-1.78
wash:-1.75
pmwiki:-1.74
pl:-1.73
Positive Logits
Monte:2.53
Eur:2.49
Kirin:2.41
Kin:2.32
Lank:2.26
Tri:2.26
Kenn:2.22
Chao:2.20
Mn:2.20
Hag:2.19
Activations Density 0.961%