INDEX
    Explanations

    references to interpersonal relationships and personal interactions

    Negative Logits
    Token   Logit
    berapa  -0.16
    erer    -0.16
    teÅŁ    -0.15
    رز      -0.15
    æĦ¿     -0.15
    stype   -0.15
    ynet    -0.14
    endir   -0.14
    monds   -0.14
    é¡ĺ     -0.14
    Positive Logits
    Token   Logit
    cha     0.17
    eca     0.15
    651     0.15
    ady     0.15
    ec      0.15
    端      0.14
    nek     0.14
     dam    0.14
    ini     0.14
    ha      0.14
    Activation Density: 0.450%

    No Known Activations