INDEX
    Explanations

    expressions of personal perspective and identification with specific groups

    Negative Logits
    Token        Logit
    "emetery"    -0.15
    " تلف"       -0.14
    " lời"       -0.14
    "urga"       -0.13
    "Monkey"     -0.13
    "OVID"       -0.13
    "VRT"        -0.13
    ",default"   -0.13
    "ads"        -0.13
    "WISE"       -0.13
    Positive Logits
    Token              Logit
    " apt"             0.23
    " gener"           0.20
    "apt"              0.20
    " affection"       0.20
    " appropriately"   0.20
    " loving"          0.20
    " Apt"             0.19
    " baptized"        0.18
    " inform"          0.18
    " misleading"      0.18
    Activation Density: 0.079%

    No Known Activations