INDEX
    Explanations

    punctuation and indicators of dialogue

    Negative Logits
     penetr
    -0.15
     tube
    -0.15
    kop
    -0.14
    kre
    -0.14
    cher
    -0.14
    ube
    -0.14
    rels
    -0.14
     tubes
    -0.14
     風
    -0.14
     lif
    -0.13
    Positive Logits
    è¡
    0.17
     Walters
    0.16
    740
    0.15
    /fast
    0.15
     conc
    0.15
    662
    0.14
    741
    0.14
    345
    0.14
    409
    0.14
    ipse
    0.14
    Act Density 0.013%

    No Known Activations