INDEX
    Explanations

    specific characters or symbols that may indicate formatting or coding elements within the text

    Negative Logits
     "`             -0.16
     "              -0.15
     adversely      -0.15
     hubby          -0.14
     Totally        -0.14
    .connector      -0.13
     totally        -0.13
    ":-             -0.13
    Oops            -0.13
    SF              -0.13
    Positive Logits
     fucking        0.33
     fucked         0.29
     fuck           0.27
     Fucking        0.27
     fucks          0.25
     FUCK           0.25
     Fuck           0.24
    fuck            0.24
     cunt           0.23
     –↵             0.22
    Activation Density: 0.004%

    No Known Activations