INDEX
Explanations
Attends to the token "Go" from related tokens that represent actions or contexts.
Head Attr Weights
0:0.07
1:0.02
2:0.03
3:0.02
4:0.71
5:0.06
6:0.02
7:0.04
Negative Logits
back    -0.33
parti   -0.33
ob      -0.32
out     -0.32
Bush    -0.32
el      -0.31
        -0.31
متعلقه  -0.30
af      -0.30
off     -0.30
Positive Logits
líquida         0.56
decât           0.55
chinoise        0.55
grecque         0.52
traditionnelle  0.52
Sopho           0.52
ägg             0.52
griega          0.51
eenige          0.51
picioare        0.50
Activations Density 0.157%