INDEX
    Explanations

    the word "normal" or variations of it

    the word "normal" and its various contexts related to societal standards and behaviors

    Negative Logits
    better       -0.74
    hani         -0.70
    artisan      -0.68
    leted        -0.66
    raped        -0.64
    Sov          -0.64
    intel        -0.63
    resent       -0.62
    hung         -0.61
    Winged       -0.61
    Positive Logits
    cy            1.50
    ization       1.44
    izing         1.40
    isation       1.35
    ised          1.34
    izes          1.30
    ize           1.23
    ising         1.21
    ized          1.19
    izations      1.12
    Activation Density: 0.044%

    No Known Activations