    Explanations

    proper names, particularly those of authors and contributors in academic contexts

    Negative Logits
    Token       Logit
    "ubs"       -0.16
    " Sala"     -0.16
    " BÃŃ"      -0.16
    "reib"      -0.15
    "eks"       -0.15
    "urve"      -0.14
    "sov"       -0.14
    "esson"     -0.14
    "ooks"      -0.14
    "imens"     -0.13
    Positive Logits
    Token       Logit
    "spacer"    0.14
    "ugu"       0.14
    " kitty"    0.14
    "jav"       0.14
    " ê·ľ"      0.14
    "ushman"    0.13
    "eck"       0.13
    "ijľ"       0.13
    "TRL"       0.13
    "ost"       0.13
    Activation Density: 0.079%

    No Known Activations