    Negative Logits
    Token            Logit
    "ಳಿಗೆ"            0.78
    " വേദിക"          0.74
    "人に"            0.73
    " சுவாச"          0.72
    "対処"            0.71
    " تتع"           0.71
    " dijadikan"     0.70
    " wobec"         0.70
    "))+\"           0.70
    "vika"           0.70
    Positive Logits
    Token            Logit
    " from"          1.83
    " From"          1.83
    "From"           1.77
    "from"           1.76
                     1.71
    " FROM"          1.66
                     1.64
    "是从"            1.63
    "FROM"           1.58
    " desde"         1.58
    Activation Density: 0.271%

    No Known Activations