INDEX
Explanations
copyright-related phrases and publication years
Head Attr Weights
0:0.01
1:0.01
2:0.12
3:0.17
4:0.05
5:0.02
6:0.06
7:0.17
8:0.03
9:0.04
10:0.06
11:0.22
Negative Logits
iversity      -1.48
SourceFile    -1.34
prints        -1.34
Reviewer      -1.29
esty          -1.29
mite          -1.26
YC            -1.26
fees          -1.26
conservancy   -1.26
Atl           -1.25
Positive Logits
atile         1.40
oters         1.39
icians        1.25
Leader        1.24
states        1.24
ammers        1.20
bind          1.19
Mew           1.19
depress       1.18
Belarus       1.17
Activations Density 0.003%