    Negative Logits
     behaved             -0.07
    eways                -0.06
    ahr                  -0.06
     Avg                 -0.06
     payment             -0.06
    (not shown)          -0.06
     Forge               -0.06
    ril                  -0.06
    (not shown)          -0.06
     fracking            -0.06
    Positive Logits
    qn                   0.07
     concerted           0.07
    ww                   0.07
     ****************    0.07
     suicides            0.06
     objeto              0.06
    (router              0.06
    -Jan                 0.06
    -pol                 0.06
    "/>↵                 0.06
    Act Density 0.052%

    No Known Activations