INDEX
    Explanations

    words that express inclusivity or commonly reference groups

    Negative Logits
    "ThroughAttribute"    -0.76
    "LabelTagHelper"      -0.76
    "TagMode"             -0.70
    " propOrder"          -0.70
    " réguli"             -0.67
    " greateſt"           -0.67
    "ystema"              -0.64
    " Cedric"             -0.64
    " noires"             -0.61
    " Theſe"              -0.61
    Positive Logits
    " đều"      0.83
    " nhau"     0.58
    " very"     0.56
    "bajo"      0.56
    "enumi"     0.56
    " being"    0.55
    " were"     0.54
    "very"      0.54
    "tocin"     0.53
    "urti"      0.52
    Act Density 0.199%

    No Known Activations