Explanations
questions and references to timing or occurrences of specific events
Head Attr Weights
0:0.01
1:0.00
2:0.22
3:0.18
4:0.21
5:0.02
6:0.05
7:0.12
8:0.03
9:0.03
10:0.04
11:0.04
Negative Logits
wcsstore     -1.47
ipher        -1.46
livious      -1.38
inently      -1.35
bnb          -1.34
panel        -1.29
ividually    -1.29
76561        -1.26
cffffcc      -1.26
atari        -1.25
Positive Logits
?).     1.73
?),     1.63
)?      1.62
etus    1.59
?,      1.58
??      1.56
??      1.53
?'      1.53
.?      1.52
?!      1.48
Activations Density 0.011%