    Explanations

    numerical values and mathematical expressions

    Profanity and swear words

    strong negative emotion

    Negative Logits
    Token               Logit
    "]]"                -0.73
    " niedersachsen"    -0.71
    " umgekehrt"        -0.69
    "intenant"          -0.67
    " Wikiseite"        -0.67
    " étoient"          -0.67
    " }],"              -0.66
    "seguida"           -0.65
    " vieles"           -0.64
    " giebt"            -0.63
    Positive Logits
    Token               Logit
    " fucking"          0.85
    " FUCKING"          0.81
    "fucking"           0.80
    " goddamn"          0.78
    "でございます"        0.78
    "fuck"              0.77
    " fuck"             0.76
                        0.74
    " fuckin"           0.73
    " doth"             0.72
    Activation Density: 0.749%

    No Known Activations