INDEX
    Explanations

    references to information and awareness of societal interactions

    Negative Logits
    Token            Logit
    "rett"           -0.16
    " inhabited"     -0.16
    "own"            -0.15
    "iales"          -0.15
    " ton"           -0.15
    " R"             -0.14
    " exact"         -0.14
    " pros"          -0.14
    "avez"           -0.14
    "677"            -0.14
    Positive Logits
    Token            Logit
    "ADOW"           0.19
    "pery"           0.17
    "ãĥķãĥ¬"         0.17
    "ORK"            0.17
    " BindingFlags"  0.16
    "boru"           0.16
    "yah"            0.15
    "ito"            0.15
    "afone"          0.15
    " Regional"      0.15
    Activation Density: 0.013%

    No Known Activations