    Explanations

    legal terminology and references to specific cases or legal entities

    Negative Logits
    Token            Logit
    " their"         -0.97
    "Their"          -0.95
    " Their"         -0.92
    " themselves"    -0.91
    "themselves"     -0.89
    "their"          -0.88
    " THEIR"         -0.79
    "他们的"          -0.73
    "他們的"          -0.70
    " who"           -0.63

    Positive Logits
    Token            Logit
    " its"            2.04
    "Its"             1.91
    " Its"            1.82
    " itself"         1.81
    "its"             1.57
    " Itself"         1.52
    "itself"          1.45
    "它的"            1.40
    " ITS"            1.28
    " яке"            1.20
    Activation Density: 1.042%

    No Known Activations