    Explanations

    phrases and actions related to communication and social interactions

    Negative Logits
    ichick      -0.15
    aleb        -0.15
    оза         -0.15
    /respond    -0.15
    umi         -0.15
    vens        -0.14
    iaux        -0.14
    anager      -0.14
    inks        -0.14
    byss        -0.14
    Positive Logits
     accordingly    0.20
     duly           0.18
     Accordingly    0.14
    δÎŃ             0.14
    EntityType      0.14
     Vec            0.13
    ex              0.13
    NR              0.13
     armored        0.13
    ãĥ¼ãĥ³          0.13
    Act Density 0.277%

    No Known Activations