INDEX
    Explanations

    references to role models and their impact on communities

    Negative Logits
    Token        Logit
    "íĦ¸"        -0.16
    "igsaw"      -0.14
    "Registrar"  -0.14
    "511"        -0.13
    "antu"       -0.13
    "ạn"         -0.13
    " Ñģна"      -0.12
    "307"        -0.12
    "aidu"       -0.12
    "789"        -0.12

    Positive Logits
    Token        Logit
    " role"      0.70
    " Role"      0.65
    "role"       0.61
    "Role"       0.56
    " ROLE"      0.52
    "-role"      0.52
    "_role"      0.47
    "(role"      0.43
    ".role"      0.43
    "ROLE"       0.43
    Activation Density: 0.157%

    No Known Activations