INDEX
    Explanations

    dialogue quotes and contractions

    Negative Logits
    Token           Logit
    "<bos>"         -3.34
    " intersper"    -1.54
    "/***"          -1.50
    " hentai"       -1.47
    " embra"        -1.41
    " pessi"        -1.35
    " suspic"       -1.31
    " milf"         -1.30
    (blank)         -1.29
    " encre"        -1.29
    Positive Logits
    Token           Logit
    "'"             0.81
    (not shown)     0.80
    "s"             0.63
    "i"             0.60
    "A"             0.60
    "mathrm"        0.59
    "S"             0.59
    "An"            0.58
    "I"             0.58
    (not shown)     0.57
    Activation Density: 1.007%

    No Known Activations