INDEX
    Explanations

    words related to legal rights and actions

    the definite article "the"

    NEGATIVE LOGITS
    "thood"          -0.79
    "iffe"           -0.70
    "leeve"          -0.59
    " suppose"       -0.58
    "gat"            -0.58
    "illon"          -0.58
    "advertising"    -0.55
    "den"            -0.55
    "ius"            -0.55
    "IDs"            -0.54
    POSITIVE LOGITS
    "ses"            1.13
    " same"          1.11
    " quickest"      1.07
    " slightest"     1.05
    " longest"       1.04
    " hardest"       1.04
    " fastest"       1.02
    " entirety"      0.97
    " way"           0.97
    " entire"        0.94
    Act Density 0.279%

    No Known Activations