INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "\}$"          0.40
    " समीप"        0.37
    "rti"          0.36
    " लिखिए"       0.36
    " 위해"        0.36
    "unnels"       0.36
    " απε"         0.35
    " ʻ"           0.35
    "पक्ष"         0.34
    "</caption>"   0.34
    POSITIVE LOGITS
    Token           Logit
    " Bur"          0.47
    " bur"          0.46
    "bur"           0.40
    " recyclable"   0.40
    " BUR"          0.39
    " brewing"      0.38
    " Bucs"         0.37
    "intuitive"     0.37
    " Бу"           0.36
    " intuitive"    0.36
    Activation Density: 0.007%

    No Known Activations