INDEX
    Explanations

    social, emotional, political, legal, sexual, racial categories

    economic, legal, social, emotional, financial, sexual aspects

    NEGATIVE LOGITS
    "ي"         0.55
    "يا"        0.54
    "وم"        0.53
    "م"         0.52
    "एस"        0.51
    "ير"        0.50
    "يل"        0.50
    (blank)     0.48
    "ك"         0.47
    (blank)     0.47
    POSITIVE LOGITS
    ";"         0.54
    "al"        0.46
    ")"         0.42
    "o"         0.41
    "ATING"     0.40
    "UGH"       0.40
    "<0x80>"    0.39
    " a"        0.38
    "akn"       0.38
    " Bupati"   0.38
    Activation Density: 3.065%

    No Known Activations