Explanations
instances of the word "because."
Head Attribution Weights (head:weight)
0:0.02
1:0.01
2:0.08
3:0.11
4:0.15
5:0.04
6:0.05
7:0.18
8:0.03
9:0.04
10:0.07
11:0.15
Negative Logits
scription   -1.68
scribe      -1.56
daq         -1.55
ipel        -1.50
icter       -1.49
urnal       -1.45
yles        -1.44
eli         -1.44
ocket       -1.40
lashes      -1.39
Positive Logits
Overs       1.61
Ü           1.53
atown       1.43
≡           1.42
Friedrich   1.40
exha        1.40
cumbers     1.40
Wolfgang    1.39
Spiegel     1.39
Jung        1.37
Activations Density 0.003%