INDEX
    Explanations

    sample, placeholder, dummy, replica, example

    Negative Logits
    "the"      0.79
    " the"     0.76
    " an"      0.64
    " a"       0.60
    " The"     0.59
    "t"        0.57
    " AKA"     0.55
    " THE"     0.53
    "thed"     0.52
    "ti"       0.51
    Positive Logits
    "Fmat"          0.60
    "Ws"            0.59
    "Ι"             0.59
    "ንስ"            0.55
    "clientes"      0.55
    " plufieurs"    0.54
    "ሎጂ"            0.54
    " لأ"           0.53
    "ഡ്"            0.52
    "Texts"         0.52
    Act Density 0.229%

    No Known Activations