INDEX
    Explanations

    dollar amounts following dollar signs

    Negative Logits
    та        0.63
              0.63
    なります  0.62
    ۰         0.60
    ுடன்      0.59
    е         0.59
    ہ         0.58
              0.58
    о         0.57
              0.57
    Positive Logits
     $        0.87
     a        0.73
    د         0.67
    a         0.67
    d         0.64
    S         0.62
     '        0.60
    daki      0.57
     on       0.57
    t         0.55
    Act Density 0.150%

    No Known Activations