INDEX
    Explanations

    proper nouns and specific words

    Negative Logits
     веществ    0.77
     اصلی    0.73
     rẻ    0.68
     gebied    0.67
     сторону    0.65
     любой    0.63
     bolsillo    0.63
     машины    0.61
     cajas    0.61
     май    0.61
    Positive Logits
    as    0.87
    o    0.85
    oa    0.83
    ievement    0.82
    শংস    0.81
    та    0.80
    asati    0.80
     Poems    0.79
    endem    0.78
    f    0.78
    Act Density 0.005%

    No Known Activations