    Explanations

    expressions of surprise or frustration

    Negative Logits
    Token                 Logit
    " незавершена"        -0.84
    " iſt"                -0.69
    " ſy"                 -0.68
    "^(@)"                -0.68
    " vuitton"            -0.68
    " NDEBUG"             -0.65
    "]-->"                -0.65
    " idéia"              -0.64
    " AssemblyCulture"    -0.61
    " '\\;'"              -0.61
    Positive Logits
    Token                 Logit
    " fucking"            0.71
    " FUCKING"            0.66
    "Fucking"             0.60
    " lmao"               0.60
    " goddamn"            0.59
    "fucking"             0.59
    " fuckin"             0.57
    " realisation"        0.57
    " Fucking"            0.56
    "Fuck"                0.56
    Activation Density: 0.141%

    No Known Activations