INDEX
    Explanations

    discussions of personal experiences, often with emotional or controversial implications

    Negative Logits
    " deres"       -0.90
    " ourselves"   -0.80
    "彼らは"        -0.79
    " yourselves"  -0.74
    " deras"       -0.71
    " eorum"       -0.67
    " loro"        -0.65
    " kanilang"    -0.65
    " unison"      -0.60
    "seamnă"       -0.60
    Positive Logits
    " himself"   1.86
    " his"       1.82
    "himself"    1.49
    "his"        1.31
    " seinem"    1.16
    " seiner"    1.12
    " kanyang"   1.04
    " seines"    1.03
    " dirinya"   1.00
    " Himself"   0.98
    Activation Density: 1.236%

    No Known Activations