INDEX
    Explanations

    references to friendliness or positive social interactions

    Negative Logits
    Token     Logit
    "ors"     -0.20
    "iled"    -0.15
    "CCA"     -0.15
    "ÑĨÑİ"    -0.15
    " Dit"    -0.15
    "eday"    -0.15
    "veis"    -0.15
    "sr"      -0.14
    "875"     -0.14
    "sf"      -0.14
    Positive Logits
    Token             Logit
    "lier"            0.25
    "liest"           0.20
    "ships"           0.18
    "liness"          0.18
    " disposed"       0.17
    " neighborhood"   0.17
    " confines"       0.17
    "acht"            0.17
    " neighbourhood"  0.16
    " Fam"            0.16
    Activation Density: 0.015%

    No Known Activations