INDEX
    Explanations

    expressions of privilege and opportunities presented to individuals

    Negative Logits
    itus            -0.16
    ãģ£ãģ           -0.15
    loo             -0.15
     Mog            -0.15
    405             -0.15
    -scalable       -0.14
    perator         -0.14
    rå              -0.14
    âī¡             -0.14
    ÑħодиÑĤÑĮ       -0.13
    Positive Logits
     privilege      0.73
     pleasure       0.65
    priv            0.61
     priv           0.60
     prive          0.58
     Priv           0.56
    Priv            0.51
     PRIV           0.51
     Ple            0.50
    ple             0.49
    Activation Density 0.086%

    No Known Activations