INDEX
    Explanations

    emotional reactions and interpersonal dynamics

    Negative Logits
    " embar"     -0.24
    " ("         -0.17
    " ["         -0.17
    " “["        -0.16
    " perhaps"   -0.16
    "ocator"     -0.16
    " !["        -0.15
    "perhaps"    -0.15
    " Paren"     -0.15
    "rac"        -0.15
    Positive Logits
    " fucking"   0.25
    " fuck"      0.25
    " fucked"    0.23
    " fucks"     0.23
    " asshole"   0.22
    " cazzo"     0.20
    " assh"      0.20
    "fuck"       0.20
    " Fuck"      0.20
    " tonight"   0.19
    Act Density 0.047%

    No Known Activations