INDEX
Explanations
occurrences of the word "of"
Head Attr Weights
0:0.02
1:0.01
2:0.11
3:0.06
4:0.22
5:0.05
6:0.03
7:0.22
8:0.04
9:0.04
10:0.08
11:0.07
Negative Logits
phia:-1.36
cca:-1.30
SHA:-1.29
ppa:-1.24
lihood:-1.24
bows:-1.19
SHIP:-1.18
Merit:-1.16
anus:-1.15
iflower:-1.15
Positive Logits
mism:1.38
favourable:1.34
knowledgeable:1.32
teams:1.27
clues:1.25
volunteers:1.24
specific:1.24
perspectives:1.23
diagrams:1.21
brackets:1.20
Activations Density 0.001%