INDEX
    Explanations

    personal names or pronouns in the text

    Negative Logits
    arth            -0.18
    far             -0.17
    ople            -0.17
    aks             -0.16
    aring           -0.15
    isure           -0.15
    opard           -0.14
     lien           -0.14
    apon            -0.14
    699             -0.14
    Positive Logits
    issing          0.17
    ãĥĥãĤ¯          0.17
    ibel            0.16
    ombs            0.15
    lesen           0.15
    jc              0.15
    bout            0.15
    ibling          0.15
     MetroFramework 0.15
    ukes            0.15
    Activation Density: 0.066%

    No Known Activations