Explanations
Attends to tokens expressing a contrast or condition, from related affirmative tokens that appear later in the sequence.
Head Attr Weights
0:0.27
1:0.21
2:0.14
3:0.10
4:0.06
5:0.02
6:0.07
7:0.09
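The weights above can be summarized with a short script. This is only a sketch: it assumes each `head:weight` pair is the share of the feature's attribution assigned to that attention head, and the helper name `heads_covering` is hypothetical, introduced here for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each value is that head's share of the feature's attribution.
head_weights = {0: 0.27, 1: 0.21, 2: 0.14, 3: 0.10, 4: 0.06, 5: 0.02, 6: 0.07, 7: 0.09}

total = sum(head_weights.values())

# Heads ordered from strongest to weakest contribution.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]


def heads_covering(share):
    """Return the smallest set of top-ranked heads whose combined weight
    reaches `share` of the listed total (hypothetical helper)."""
    chosen, running = [], 0.0
    for head, weight in ranked:
        chosen.append(head)
        running += weight
        if running / total >= share:
            break
    return chosen


print(f"listed total: {total:.2f}")
print(f"strongest head: {top_head} ({top_weight:.2f})")
print(f"heads covering 60% of the listed total: {heads_covering(0.6)}")
```

The listed weights sum to slightly less than 1, which is probably a rounding effect in the display; the script therefore normalizes by the listed total rather than by 1.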
Negative Logits
ConstraintMaker: -0.33
GEBURTSDATUM: -0.29
tartalomajánló: -0.29
ویکیپدیای: -0.28
Clik: -0.28
newOwner: -0.28
makeConstraints: -0.28
StringTokenizer: -0.27
Monter: -0.27
ModelExpression: -0.27
Positive Logits
ffet: 0.30
énie: 0.30
pters: 0.30
Italijani: 0.29
varande: 0.28
Réponses: 0.28
ieu: 0.27
textAppearance: 0.27
odly: 0.27
GeneratedCode: 0.26
Activations Density: 0.626%