    Explanations

    quotes and interjections

    words and phrases indicating relationships and emotional interactions.

    Negative Logits
    "Always"       -0.07
    "(new"         -0.07
    "Needed"       -0.07
    "EXPECTED"     -0.06
    "动物"         -0.06
    " overweight"  -0.06
    "Besides"      -0.06
    "WHAT"         -0.06
    "nějších"      -0.06
    " Besides"     -0.06
    Positive Logits
    "cl"           0.06
    "ulla"         0.06
    "tak"          0.06
    " refunds"     0.06
    "eos"          0.06
    "uctor"        0.06
    "lj"           0.06
    "directories"  0.06
    " TRI"         0.06
    " โดย"         0.06
    Act Density 0.059%

    No Known Activations