INDEX
    Explanations
    Negative Logits
    Token             Logit
    "cers"            -0.76
    " foremost"       -0.70
    "ACTED"           -0.70
    "Interstitial"    -0.68
    " minded"         -0.63
    "riott"           -0.63
    "nces"            -0.60
    " Lauder"         -0.60
    " Scots"          -0.59
    " ransom"         -0.58
    Positive Logits
    Token             Logit
    "ombie"            1.40
    "ebra"             1.24
    "ombies"           1.22
    "arro"             1.09
    "odiac"            1.09
    "iggurat"          1.05
    "ealous"           1.01
    "hou"              1.00
    "eros"             1.00
    "ymes"             0.99
    Activation Density: 3.849%

    No Known Activations