INDEX
    Explanations

    instances of the word "it" and expressions of entertainment-related content

    NEGATIVE LOGITS
    Token       Logit
    "bote"      -0.19
    "ucci"      -0.17
    "awa"       -0.16
    "988"       -0.15
    "_imm"      -0.14
    "iquid"     -0.14
    "nock"      -0.14
    "nergy"     -0.14
    "ergus"     -0.14
    "ãĥĥãĥĹ"    -0.14
    POSITIVE LOGITS
    Token       Logit
    "éĬĢè¡Į"    0.16
    " Nar"      0.15
    " nar"      0.15
    " Hund"     0.14
    " Extreme"  0.14
    "yw"        0.14
    " extreme"  0.14
    " Od"       0.13
    "jez"       0.13
    "auled"     0.13
    Act Density 0.047%

    No Known Activations
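
    Two of the tokens listed above ("ãĥĥãĥĹ" and "éĬĢè¡Į") look like byte-level BPE display strings rather than corrupted text. The sketch below assumes the underlying tokenizer uses the GPT-2 style byte-to-unicode table; the source does not say which tokenizer produced these tokens, so treat that as an assumption. The helper name decode_token is hypothetical and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points starting at 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(tok: str) -> str:
    """Hypothetical helper: map a byte-level display string back to UTF-8 text."""
    raw = bytearray()
    for ch in tok:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space, as the
            # dashboard shows it) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["éĬĢè¡Į", "ãĥĥãĥĹ", " Nar", "bote"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under that assumption, "éĬĢè¡Į" decodes to "銀行" and "ãĥĥãĥĹ" decodes to "ップ", while plain ASCII tokens such as " Nar" and "bote" come back unchanged.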