INDEX
    Explanations

    statements that assert or emphasize the existence or importance of a subject

    Negative Logits
    "zin"      -0.08
    "acid"     -0.07
    "esi"      -0.06
    "ead"      -0.06
    "hiba"     -0.06
    " dig"     -0.06
    "etest"    -0.06
    "amon"     -0.06
    "zik"      -0.06
    "_fwd"     -0.06
    Positive Logits
    " toward"   0.09
    " towards"  0.09
    " tw"       0.07
    "ırak"      0.07
    "onus"      0.06
    "igli"      0.06
    ".tw"       0.06
    "оров"      0.06
    "乎"        0.06
    " how"      0.06
    Activation Density: 0.008%

    No Known Activations