INDEX
    Explanations

    conversational phrases and informal speech patterns

    Negative Logits
    Token       Logit
    "avr"       -0.15
    "iphy"      -0.15
    "åı¸"       -0.14
    " COVER"    -0.14
    " ("        -0.14
    "afs"       -0.13
    "atown"     -0.13
    "ostel"     -0.13
    "apur"      -0.13
    "omens"     -0.13
    Positive Logits
    Token       Logit
    " fuck"     0.26
    " man"      0.25
    " fucked"   0.23
    " fucks"    0.23
    " shit"     0.22
    "fuck"      0.21
    "Fuck"      0.20
    " cats"     0.19
    " cat"      0.19
    " Fuck"     0.19
    Activation Density: 0.001%

    No Known Activations