INDEX
    Explanations

    instances of communication and social interaction

    Negative Logits
    chw
    -0.16
    ditor
    -0.15
    olta
    -0.15
    dera
    -0.15
    ày
    -0.15
    (Abstract
    -0.14
    table
    -0.14
    alytics
    -0.14
    erte
    -0.14
     когда
    -0.14
    Positive Logits
     it
    0.28
     оно
    0.22
    inson
    0.21
     chances
    0.21
     they
    0.20
     itu
    0.19
    它
    0.18
     odds
    0.18
     그것
    0.17
     воно
    0.17
    Act Density 0.181%

    No Known Activations