INDEX
    Explanations

    references to friendship and community interactions

    NEGATIVE LOGITS
    uff
    -0.20
     Marino
    -0.17
    hani
    -0.16
    сол
    -0.15
    ottes
    -0.15
    fortune
    -0.15
    gard
    -0.15
    uffles
    -0.15
    OLON
    -0.15
    enth
    -0.14
    POSITIVE LOGITS
    undi
    0.18
    арч
    0.16
    ρά
    0.15
    ober
    0.14
    umber
    0.14
    mal
    0.14
     fish
    0.14
    dar
    0.14
    ργ
    0.13
     Fish
    0.13
    Act Density 0.046%

    No Known Activations