INDEX
Explanations
instances of the word "go"
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.08
4:0.06
5:0.08
6:0.08
7:0.08
8:0.08
9:0.09
10:0.07
11:0.08
Negative Logits
ciating       -2.03
Palestin      -1.94
Iro           -1.89
Elections     -1.86
Constitution  -1.83
AFB           -1.83
Tackle        -1.77
ENT           -1.76
Bucks         -1.76
ousing        -1.75
Positive Logits
perse         2.27
hindsight     2.13
Repl          2.09
repl          2.07
Sov           2.06
REDACTED      2.01
replacements  2.01
stockp        1.99
sav           1.94
ゴ            1.94
Activations Density 0.000%