INDEX
    Explanations

    phrases starting with quotations

    instances of direct speech

    Negative Logits
     princip    -0.85
     flared     -0.81
     deterrent  -0.77
     bod        -0.72
     cheek      -0.72
     clut       -0.72
     arri       -0.71
     conve      -0.71
     adv        -0.70
     sund       -0.70
    Positive Logits
    Hey         1.54
    hey         1.41
    Oh          1.39
    hello       1.34
    Damn        1.25
    why         1.25
    Fuck        1.25
    Look        1.24
    Why         1.23
    Okay        1.23
    Activation Density: 0.049%

    No Known Activations