INDEX
    Explanations

    high-frequency words and phrases indicating actions or states

    Negative Logits
    endor      -0.17
    OTE        -0.16
    ote        -0.16
    itele      -0.16
    rada       -0.16
    889        -0.15
    ae         -0.15
    ke         -0.14
    ampo       -0.14
    anja       -0.14
    Positive Logits
     Maver      0.15
    Çİ         0.15
    αÏģά       0.14
    redicate   0.14
    ltra       0.14
    çŃ         0.14
    apy        0.14
    áu         0.14
    oine       0.14
    reno       0.14
    Activation Density: 0.022%

    No Known Activations