INDEX
Explanations
instances of the word "is."
Head Attr Weights
0:0.09
1:0.07
2:0.09
3:0.09
4:0.04
5:0.09
6:0.10
7:0.04
8:0.08
9:0.10
10:0.12
11:0.04
Negative Logits
drown: -1.30
blers: -1.29
xff: -1.26
xes: -1.24
cks: -1.23
turbines: -1.20
�: -1.19
alions: -1.19
inces: -1.18
phony: -1.18
Positive Logits
ende: 2.05
behav: 1.80
arrang: 1.51
streng: 1.50
awa: 1.49
endars: 1.48
���: 1.45
warr: 1.45
endum: 1.44
terminology: 1.37
Activations Density 0.000%