INDEX
Explanations
instances of the word "here."
Head Attr Weights
0:0.20
1:0.15
2:0.04
3:0.05
4:0.04
5:0.08
6:0.06
7:0.04
8:0.08
9:0.07
10:0.05
11:0.09
Negative Logits
TPPStreamerBot  -1.86
Redditor        -1.79
Except          -1.73
Unknown         -1.64
Offline         -1.62
nown            -1.60
Subject         -1.56
WithNo          -1.56
◼               -1.52
NetMessage      -1.51
Positive Logits
contrace  1.61
rica      1.51
hires     1.45
lie       1.40
vez       1.39
repo      1.38
gow       1.37
ogun      1.32
agre      1.32
q         1.30
Activations Density 0.000%