Explanations
instances of the pronoun "it" and the letter "u"
Head Attr Weights
0:0.02
1:0.02
2:0.06
3:0.06
4:0.06
5:0.04
6:0.41
7:0.06
8:0.04
9:0.05
10:0.08
11:0.05
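The weights above can be read programmatically. The following is a minimal sketch, assuming each line has the form "head index:weight"; the names `raw` and `parse_head_weights` are hypothetical and are not part of any tool's API.

```python
# Head attribution weights copied from the list above ("head:weight" per line).
raw = """0:0.02
1:0.02
2:0.06
3:0.06
4:0.06
5:0.04
6:0.41
7:0.06
8:0.04
9:0.05
10:0.08
11:0.05"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict (hypothetical helper)."""
    weights = {}
    for line in text.strip().splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(raw)
top_head = max(weights, key=weights.get)  # head with the largest listed weight
total = sum(weights.values())             # sum of all listed weights

print(top_head, weights[top_head], round(total, 2))
```

On these numbers, head 6 carries 0.41 of a listed total of 0.95, so it accounts for more attribution than any other single head.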
Negative Logits
undred: -1.48
ournals: -1.47
��: -1.42
waukee: -1.34
boxing: -1.29
isite: -1.28
izable: -1.27
覚醒: -1.23
�: -1.22
readable: -1.22
Positive Logits
stewards: 1.50
Eisenhower: 1.39
SHIP: 1.19
celebr: 1.18
eston: 1.17
DAY: 1.15
Sears: 1.13
attendant: 1.11
Orders: 1.11
media: 1.09
Activations Density 0.001%