    Explanations

    phrases indicating inequality or disparity in social or economic contexts

    Negative Logits
    Token                 Logit
    " Roskov"             -0.94
    "Datuak"              -0.93
    " nahilalakip"        -0.91
    " الحره"              -0.90
    " Normdatei"          -0.89
    " utafitiHapana"      -0.79
    "Rhestr"              -0.76
    "########."           -0.75
    " StatelessWidget"    -0.74
    "tvguidetime"         -0.72
    Positive Logits
    Token                 Logit
    "ward"                0.76
    "wards"               0.70
    "most"                0.58
    " olev"               0.52
    "WARD"                0.47
    " felé"               0.47
    "flow"                0.45
    "finch"               0.43
    "สุด"                 0.42
    "iség"                0.41
    Activation Density: 0.524%

    No Known Activations