INDEX
Explanations
phrases indicating emotional reflections and social critiques
Head Attr Weights
0:0.09
1:0.13
2:0.03
3:0.04
4:0.02
5:0.37
6:0.02
7:0.02
8:0.09
9:0.04
10:0.06
11:0.03
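The weights above can be read programmatically. The following is a minimal sketch, assuming each entry is `head index:weight` and that the values are rounded shares of attribution per attention head; the listing does not state how they are normalized. The variable names are illustrative and do not come from the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.09, 1: 0.13, 2: 0.03, 3: 0.04, 4: 0.02, 5: 0.37,
    6: 0.02, 7: 0.02, 8: 0.09, 9: 0.04, 10: 0.06, 11: 0.03,
}

# Sum of the displayed values; rounding means this need not equal 1.
total = sum(head_weights.values())

# Head with the largest displayed weight.
top_head = max(head_weights, key=head_weights.get)

# Share of the displayed total carried by that head.
top_share = head_weights[top_head] / total

# Heads ordered from largest to smallest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)

print(f"top head: {top_head}, weight {head_weights[top_head]:.2f}")
print(f"displayed total: {total:.2f}, top-head share: {top_share:.1%}")
```

As displayed, head 5 carries the largest weight (0.37), followed by head 1 (0.13).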
Negative Logits
Lic            -1.88
*/(            -1.75
Membership     -1.75
Spider         -1.70
itamin         -1.69
employ         -1.64
cium           -1.64
cies           -1.63
venue          -1.62
Annotations    -1.61
Positive Logits
��             2.03
nightmares     1.94
vain           1.92
velt           1.90
shudder        1.88
wonder         1.79
Kurd           1.75
Romero         1.72
doubt          1.71
remembering    1.71
Activations Density 0.098%
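To put the density figure in concrete terms, here is a small sketch that converts it to a rate. It assumes "Activations Density" means the fraction of tokens on which the feature has a nonzero activation; the dashboard does not define the term, and the helper `expected_activations` is hypothetical.

```python
# Density as shown on the dashboard, in percent.
density_percent = 0.098

# Convert the percentage to a fraction of tokens.
density = density_percent / 100

# Average number of tokens per activation, under the assumption above.
tokens_per_activation = 1 / density


def expected_activations(n_tokens: int) -> float:
    """Expected number of activating tokens in a sample of n_tokens."""
    return n_tokens * density


print(f"about 1 activation per {tokens_per_activation:.0f} tokens")
print(f"expected activations in 1,000,000 tokens: {expected_activations(1_000_000):.0f}")
```

Under that reading, the feature fires on roughly one token in a thousand.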