INDEX
Explanations
occurrences of the word "will" and phrases indicating future actions or commitments
Head Attribution Weights
Head   Weight
0      0.07
1      0.34
2      0.07
3      0.06
4      0.03
5      0.04
6      0.05
7      0.04
8      0.03
9      0.13
10     0.05
11     0.03
Negative Logits
Token          Logit
ibrary         -2.93
azo            -2.76
isu            -2.74
alternative    -2.59
Previously     -2.57
ATER           -2.51
arted          -2.46
裏覚醒         -2.46
�士            -2.45
の�            -2.44
Positive Logits
Token          Logit
Keep           4.72
keep           4.52
keep           4.36
Keep           4.21
Keeping        3.99
keeping        3.93
keeps          3.81
kept           3.66
keeping        3.49
Keeping        3.42
Activations Density: 0.003%