INDEX
    Explanations
    Negative Logits
    contra          -0.06
     com            -0.06
    ACH             -0.06
    -ul             -0.06
    fieldname       -0.06
    Understanding   -0.06
     insistence     -0.06
    iry             -0.06
    (ans            -0.06
     konuş          -0.06
    Positive Logits
     automobile     0.07
     ++             0.07
     enam           0.07
     destined       0.06
     (++            0.06
    상을            0.06
     --------       0.06
    .literal        0.06
     ****************************************************************************  0.06
    0.06
    Act Density 0.011%

    No Known Activations