Explanations
Attends to specific tokens (marked with **) from various tokens (marked with [[ ]]).
Head Attribution Weights
0:0.23
1:0.23
2:0.17
3:0.07
4:0.07
5:0.03
6:0.06
7:0.09
Negative Logits
uniqlo      -0.46
tôi         -0.45
*/;         -0.45
greateſt    -0.45
endphp      -0.45
Shakspeare  -0.45
myſelf      -0.45
};*/        -0.44
?</         -0.44
fromnode    -0.43
Positive Logits
'           0.33
<eos>       0.32
n           0.31
:           0.31
л           0.31
1           0.31
tr          0.30
ason        0.29
ann         0.29
            0.28
Activations Density 0.148%