INDEX
    Explanations

    phrases related to making decisions or taking actions

    Negative Logits
    20439         -0.71
    raught        -0.69
    ²¾            -0.68
    oubted        -0.65
    ritional      -0.64
    oubt          -0.64
    tnc           -0.62
    rocal         -0.61
     actionGroup  -0.61
    ounter        -0.60
    Positive Logits
     goddamn      1.41
    *.            1.18
    !             1.17
    !!!           1.17
     damn         1.16
     godd         1.14
     fucking      1.14
    !!!!!!!!      1.11
    !!!!          1.11
    !!!!!         1.10
    Activation Density 1.008%

    No Known Activations