Explanations
interrogative phrases suggesting potential outcomes or inquiries
Head Attr Weights
0:0.02
1:0.01
2:0.20
3:0.15
4:0.09
5:0.02
6:0.20
7:0.09
8:0.03
9:0.03
10:0.05
11:0.07
Negative Logits
ahime       -1.85
mercial     -1.45
hement      -1.41
staking     -1.37
ovies       -1.37
peror       -1.36
unctions    -1.35
uci         -1.32
visors      -1.30
iculty      -1.30
Positive Logits
?'          1.66
?'"         1.59
?           1.59
?,          1.53
?!          1.51
?:          1.51
ogue        1.49
?!"         1.41
?」         1.40
Angular     1.40
Activations Density 0.007%