INDEX
    Explanations

    words related to societal norms and cultural beliefs

    references to societal norms and cultural practices

    Negative Logits
    ded             -0.71
    ש               -0.69
    aman            -0.68
    amen            -0.68
    MER             -0.68
    onement         -0.68
    Interstitial    -0.67
    amaz            -0.66
    CLASSIFIED      -0.66
    ×ŀ              -0.65
    Positive Logits
    ystem           1.11
    pace            1.02
    pring           1.02
    cape            1.01
    omething        1.00
    mith            0.99
     affecting      0.96
    hops            0.95
    hip             0.92
     influencing    0.89
    Activation Density 0.276%

    No Known Activations