INDEX
    Explanations

    punctuation marks, particularly commas

    Negative Logits
    "<bos>"         -2.76
    (blank)         -1.38
    (blank)         -1.12
    "<?"            -1.05
    "<?"            -0.96
    "/***"          -0.84
    "/**"           -0.82
    " springfox"    -0.74
    "},[])"         -0.70
    "//});"         -0.69
    Positive Logits
    " unspeak"      0.68
    " maneu"        0.61
    " impra"        0.60
    " iirc"         0.57
    " indescri"     0.57
    "Abuse"         0.56
    " unexplo"      0.54
    " pleins"       0.53
    " véhic"        0.53
    " beverly"      0.52
    Act Density 0.296%

    No Known Activations