    Explanations

    references to social roles and relationships within communities

    Negative Logits
    Token           Logit
    "clude"         -0.16
    "clud"          -0.15
    "apiro"         -0.15
    "agos"          -0.15
    " present"      -0.15
    "gu"            -0.14
    "present"       -0.14
    " tant"         -0.14
    "embros"        -0.14
    "allows"        -0.14
    Positive Logits
    Token           Logit
    " everywhere"   0.30
    " shouldn"      0.26
    " across"       0.23
    " with"         0.22
    " throughout"   0.20
    " today"        0.20
    " without"      0.20
    " around"       0.19
    " anywhere"     0.19
    " Everywhere"   0.18
    Act Density 0.299%

    No Known Activations