INDEX
    Explanations

    references to personal identity and interpersonal relationships

    NEGATIVE LOGITS
    teenth      -0.20
    phans       -0.17
    ropolis     -0.17
    uable       -0.17
    xes         -0.17
    alted       -0.17
    instanc     -0.17
    zers        -0.17
    resse       -0.16
    ımıza       -0.16
    POSITIVE LOGITS
    gether      0.56
    etheless    0.48
    linear      0.48
    existent    0.46
    west        0.46
    ductory     0.46
    adays       0.45
    neath       0.44
    adecimal    0.43
    selling     0.40
    Activation Density 0.586%

    No Known Activations