INDEX
    Explanations

    actions or verbs related to undoing or removing something

    Negative Logits
    <bos>
    -2.80
    /***
    
    -0.79
    <?
    
    -0.78
    -0.74
    <?
    -0.69
    
    
    -0.69
    /**
    -0.68
    //*/
    -0.63
    //---
    -0.63
     incarcer
    -0.59
    Positive Logits
     lele     1.34
     bandung  1.23
     jawa     1.22
     Minang   1.19
     jati     1.16
     magis    1.10
     jaya     1.04
     saar     1.00
     riva     1.00
     kaos     0.99
    Activation Density: 0.832%

    No Known Activations