    NEGATIVE LOGITS
    Token            Logit
    "<bos>"          -2.61
    (not rendered)   -1.03
    (whitespace)     -1.02
    "/***\n"         -0.99
    "<?"             -0.96
    "/**"            -0.91
    "posób"          -0.90
    "<?\n"           -0.85
    "/*++"           -0.77
    "<>\n"           -0.73
    POSITIVE LOGITS
    Token            Logit
    " lele"           1.30
    " summary"        1.29
    " thuy"           1.28
    " Summary"        1.26
    " wien"           1.25
    " meis"           1.20
    " myn"            1.16
    " fei"            1.13
    " aen"            1.12
    " SUMMARY"        1.12
    Activation Density: 0.176%

    No Known Activations