INDEX
Explanations
Attends to the first punctuation mark from later tokens that are part of complex clauses
Head Attr Weights
0:0.07
1:0.09
2:0.08
3:0.15
4:0.14
5:0.05
6:0.28
7:0.10
Negative Logits
تقاوى
-0.51
DeleteBehavior
-0.47
bezeichneter
-0.46
TagHelper
-0.46
-0.45
"..\..\..\
-0.45
ivelany
-0.45
abestanden
-0.45
WithIOException
-0.44
Aiheesta
-0.43
Positive Logits
↵↵
0.23
于是
0.23
innocently
0.23
então
0.22
↵
0.21
olge
0.21
1
0.20
,
0.20
5
0.20
便
0.20
Activations Density 0.111%