INDEX
    Explanations

    the cause or motive behind events or actions

    language related to motives and causes

    Negative Logits
    Token       Logit
    "abo"       -0.78
    "owship"    -0.73
    "gard"      -0.71
    "buff"      -0.70
    "udo"       -0.67
    "NRS"       -0.65
    "mun"       -0.65
    "OWS"       -0.64
    "paio"      -0.64
    "byter"     -0.63
    Positive Logits
    Token             Logit
    " behind"         1.18
    "why"             1.03
    " motivating"     0.99
    " underlying"     0.97
    " motives"        0.95
    " culprit"        0.95
    " why"            0.92
    " responsible"    0.92
    " motive"         0.92
    " motivations"    0.90
    Act Density 0.273%

    No Known Activations