INDEX
    Explanations

    mentions of the word "them" to indicate an emphasis on a specific group or entity

    Negative Logits
    Token          Logit
    " Racine"      -0.89
    " purpoſe"     -0.79
    " cauſe"       -0.75
    " Monfieur"    -0.72
    " uſe"         -0.71
    " Efq"         -0.71
    " pleaſure"    -0.70
    " noel"        -0.68
    " HRS"         -0.68
    " Conci"       -0.68
    Positive Logits
    Token            Logit
    " themselves"    1.48
    " Them"          1.29
    "Them"           1.26
    "themselves"     1.25
    " them"          1.17
    "selves"         1.17
    " they"          1.06
    " THEM"          1.05
    " him"           1.01
    "THEY"           0.98
    Act Density 0.037%

    No Known Activations