INDEX
    Explanations

    proper names and titles, particularly those of people and literary works

    Negative Logits
    nt        -0.23
    ma        -0.22
    ning      -0.20
    ro        -0.20
    ries      -0.20
    nya       -0.20
    ness      -0.20
    mon       -0.19
    me        -0.18
    soever    -0.18
    Positive Logits
    'nun      0.21
    ffset     0.21
    ’nun      0.20
    gether    0.20
    alesce    0.19
    ject      0.17
    hiba      0.17
    ceph      0.17
    ymous     0.17
    elho      0.16
    Act Density 0.512%

    No Known Activations