INDEX
Explanations
instances of the word "go."
Head Attr Weights
0:0.07
1:0.08
2:0.09
3:0.08
4:0.07
5:0.08
6:0.08
7:0.08
8:0.06
9:0.08
10:0.09
11:0.08
Negative Logits
que           -2.38
rall          -2.37
humanitarian  -2.26
iop           -2.15
advoc         -2.13
Palestin      -2.11
aho           -2.11
裏�            -2.10
NK            -2.10
rg            -2.10
Positive Logits
bruises     2.12
bruising    2.10
hair        2.06
axy         2.00
idia        1.99
wives       1.99
levision    1.93
aura        1.87
Appearance  1.85
iring       1.84
Activations Density 0.000%