    Explanations

    expressions of outrage and frustration regarding social issues and personal experiences

    Negative Logits
    Token         Logit
    "лату"        -0.16
    "zn"          -0.16
    "不得"        -0.15
    " Depends"    -0.14
    "iem"         -0.14
    "Trivia"      -0.14
    "usch"        -0.14
    "åĶ"          -0.14
    "pron"        -0.13
    "igua"        -0.13
    Positive Logits
    Token         Logit
    " Wake"       0.27
    " wake"       0.27
    "wake"        0.26
    "Wake"        0.26
    " Grow"       0.24
    "Grow"        0.22
    " stop"       0.22
    " why"        0.22
    " Stop"       0.21
    " grow"       0.20
    Act Density 0.293%
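
    A minimal sketch of how an activation density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature fires. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 3 of which are active.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under this reading, a density of 0.293% would mean the feature is active on roughly 3 of every 1,000 tokens sampled.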

    No Known Activations