INDEX
    Explanations

    expressions of personal concern and uncertainty

    Follows single characters or short strings

    swear words, expletives, and punctuation

    Negative Logits
    Token             Logit
    " Paglinawan"     -1.02
    "NameInMap"       -0.92
    "<bos>"           -0.91
    "`;\n"            -0.89
    "'},\n"           -0.88
    ">`;"             -0.88
    " كومونز"         -0.85
    "GenerationType"  -0.85
    " */\n\n"         -0.84
    "[])\n"           -0.83
    Positive Logits
    Token             Logit
    "."               1.07
    " fucking"        0.89
    ","               0.86
    "!"               0.83
                      0.79
    " freakin"        0.76
    " fuckin"         0.73
    " FUCKING"        0.69
    " stuff"          0.67
    "…."              0.66
    Act Density 0.534%

    No Known Activations