INDEX
    Explanations

    referencing and qualifying statements related to particular contexts in discussions about behavior or actions

    Negative Logits
    them            -0.82
    Them            -0.80
     THEM           -0.78
     őket           -0.69
     Them           -0.67
     henne          -0.67
     honom          -0.66
    これを          -0.65
     jambes         -0.63
     lui            -0.63
    Positive Logits
     there          1.62
     they           1.18
     we             1.10
     the            1.02
    there           0.95
     individuals    0.85
    )";\n           0.85
     certain        0.84
    )");\n          0.84
     it             0.82
    Activation Density: 1.331%

    No Known Activations