INDEX
    Explanations

    references to individuals or groups in general terms

    Negative Logits
    " χ"            -0.75
    " δ"            -0.67
    " SK"           -0.66
    "δ"             -0.65
    "Mot"           -0.64
    " Mot"          -0.63
    "d"             -0.63
    "tas"           -0.62
    " Δ"            -0.62
    "TagHelpers"    -0.62
    Positive Logits
    " Nadie"        1.44
    "anyone"        1.35
    "everyone"      1.34
    "everybody"     1.33
    "nobody"        1.29
    "Everyone"      1.29
    " Everyone"     1.29
    " perſon"       1.28
    " Everybody"    1.28
    "someone"       1.26
    Act Density 0.048%

    No Known Activations