    NEGATIVE LOGITS
    Token          Logit
    " Myers"       -0.07
    " dorsal"      -0.07
    "coffee"       -0.06
    " annoy"       -0.06
    " s"           -0.06
    "iedy"         -0.06
    "(theta"       -0.06
    "arse"         -0.06
    " Sessions"    -0.06
    " Tasmania"    -0.06
    POSITIVE LOGITS
    Token            Logit
    ".rdf"           0.08
    "_Control"       0.07
    "_,↵"            0.07
    " ölüm"          0.07
    "ุบาล"            0.06
    " portraying"    0.06
    " Fecha"         0.06
    " turnout"       0.06
    " webView"       0.06
    ")__"            0.06
    Activation Density: 0.034%

    No Known Activations