INDEX
    Explanations

    subjects and pronouns associated with personal actions or feelings

    Negative Logits
    ummer       -0.16
    лÑĸд        -0.16
    apr         -0.16
    oly         -0.16
    oven        -0.15
    ITOR        -0.15
    itor        -0.14
    ế           -0.14
    idon        -0.14
    ov          -0.14
    Positive Logits
    oard        0.15
     uveden     0.15
    421         0.15
    mia         0.14
    getY        0.14
    ener        0.14
    mask        0.14
    imli        0.13
     tÆ°á»Ľng   0.13
    ulario      0.13
    Activation Density 0.235%

    No Known Activations