INDEX
    Explanations

    comparisons that highlight inequality or significant issues in society

    Negative Logits
    .mdl
    -0.15
    inci
    -0.15
    ubic
    -0.14
    atron
    -0.14
    onio
    -0.14
     flip
    -0.14
    erah
    -0.14
    меш
    -0.14
    inel
    -0.14
    å£
    -0.13
    Positive Logits
     er
    0.15
    っち
    0.14
    acman
    0.14
     期
    0.14
    rec
    0.14
    ipro
    0.14
     professions
    0.14
    sons
    0.14
     انج
    0.14
     ther
    0.13
    Act Density 0.161%

    No Known Activations