INDEX
    Explanations

    phrases indicating emotional reflections and social critiques

    Head Attr Weights
    0:0.09
    1:0.13
    2:0.03
    3:0.04
    4:0.02
    5:0.37
    6:0.02
    7:0.02
    8:0.09
    9:0.04
    10:0.06
    11:0.03
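
    A minimal sketch, assuming the twelve `index:weight` pairs above are per-head attribution weights as the "Head Attr Weights" label suggests. It parses the pairs into a mapping and picks out the head with the largest weight. The helper name `parse_head_weights` is hypothetical and not part of any tool.

```python
# Head attribution weights copied from the listing above ("head:weight" pairs).
RAW_WEIGHTS = (
    "0:0.09 1:0.13 2:0.03 3:0.04 4:0.02 5:0.37 "
    "6:0.02 7:0.02 8:0.09 9:0.04 10:0.06 11:0.03"
)


def parse_head_weights(text):
    """Hypothetical helper: turn 'head:weight' pairs into {head_index: weight}."""
    weights = {}
    for pair in text.split():
        head, value = pair.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW_WEIGHTS)

# Head with the largest attribution weight.
top_head = max(weights, key=weights.get)

# Sum of the listed weights; the values are rounded to two decimals,
# so the total is not assumed to equal exactly 1.
total = sum(weights.values())

print(f"top head: {top_head} (weight {weights[top_head]:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

    In this listing, head 5 carries by far the largest single weight; the remaining heads each contribute 0.13 or less.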
    Negative Logits
    Token             Logit
    " Lic"            -1.88
    "*/("             -1.75
    " Membership"     -1.75
    "Spider"          -1.70
    "itamin"          -1.69
    "employ"          -1.64
    "cium"            -1.64
    "cies"            -1.63
    "venue"           -1.62
    " Annotations"    -1.61
    Positive Logits
    Token             Logit
    [undecodable]     2.03
    " nightmares"     1.94
    " vain"           1.92
    "velt"            1.90
    " shudder"        1.88
    " wonder"         1.79
    " Kurd"           1.75
    " Romero"         1.72
    " doubt"          1.71
    " remembering"    1.71
    Act Density 0.098%

    No Known Activations