INDEX
    Explanations

    authorship and communication

    Negative Logits
        " betont"         0.79
        " harness"        0.74
        " seek"           0.74
        " seekers"        0.74
        " emph"           0.73
        " bask"           0.72
        " seeker"         0.71
        "寻求"            0.71
        " embodied"       0.70
        " encompassed"    0.70

    Positive Logits
        " authored"       1.50
        " sent"           1.27
        "authored"        1.16
        " 작성"           1.09
        " produced"       1.09
        " submitted"      1.09
        " published"      1.09
        " erstellt"       1.08
        " Sent"           1.06
        " issued"         1.06
    Activation Density: 0.088%

    No Known Activations