INDEX
    Explanations

    phrases indicating advice or caution

    Negative Logits
    <bos>           -3.18
    <?              -0.73
    /***            -0.67
    USTAIN          -0.61
    <tfoot>         -0.60
     consolidate    -0.56
     bestow         -0.55
    /*              -0.54
     onStop         -0.54
                    -0.54
    Positive Logits
     One            0.96
    One             0.94
     ONE            0.93
     santiago       0.92
     gabri          0.90
     sergio         0.88
    ONE             0.88
     lidl           0.85
     maroc          0.84
     véhic          0.83
    Act Density 0.220%

    No Known Activations