    Explanations

    social, political, or professional roles

    Negative Logits
    ":"             1.45
    "0"             1.23
    " abstract"     1.20
    " identical"    1.20
    ","             1.18
    " important"    1.14
    " advantage"    1.09
    " tha"          1.04
    " kudos"        1.03
    ")"             1.02
    Positive Logits
    "𝙖"             1.29
                    1.29
    "𝐞"             1.26
    " clínica"      1.20
                    1.16
    "óloga"         1.15
                    1.15
    "𝑳"             1.15
    "agna"          1.13
    "presidente"    1.13
    Act Density 0.400%

    No Known Activations