    Explanations

    elements related to social interactions and group dynamics

    Negative Logits
    onymous    -0.15
    ipop    -0.15
    gens    -0.14
    wj    -0.14
    937    -0.14
    ucas    -0.14
    icon    -0.14
    zell    -0.13
     nackte    -0.13
    gay    -0.13
    Positive Logits
     å£    0.16
     глÑĥ    0.14
    entions    0.14
    [port    0.13
    olf    0.13
     KromÄĽ    0.13
    ">ÃĹ</    0.13
     stir    0.13
     VÅ¡    0.13
     jud    0.13
    Act Density 0.130%

    No Known Activations