INDEX
    Explanations

    elements of protest, critique, or strong expressions of dissent

    Negative Logits
    "pite"        -0.15
    "strand"      -0.14
    "iteral"      -0.14
    "owo"         -0.13
    "_closure"    -0.13
    "uyor"        -0.13
    "olley"       -0.13
    "hey"         -0.13
    "quences"     -0.13
    "ıc"          -0.13
    Positive Logits
    "/Open"       0.17
    " decl"       0.14
    "!"           0.14
    "linky"       0.14
    "obel"        0.14
    "raft"        0.13
    "ether"       0.13
    "ise"         0.13
    "ugen"        0.13
    " plank"      0.12
    Act Density 0.198%

    No Known Activations