INDEX
    Explanations

    Bring on / Ask me / Relax

    Negative Logits
    "/"           0.36
    (whitespace)  0.32
    (whitespace)  0.32
    " +"          0.32
    "ר"           0.31
    " à"          0.30
    " để"         0.30
    " in"         0.30
    " both"       0.30
    " various"    0.30
    Positive Logits
    " pierwszy"           0.36
    "どんどん"            0.34
    " लिखिए"              0.34
    "leyin"               0.33
    "狠狠"                0.33
    (token not rendered)  0.33
    " pierwsze"           0.32
    " jezik"              0.31
    "とにかく"            0.31
    (token not rendered)  0.31
    Act Density 0.158%

    No Known Activations