INDEX
    Explanations

    references to individuals' names, particularly first initials followed by surnames

    Negative Logits
    Token    Logit
     unw     -0.18
    areth    -0.18
    aise     -0.17
    ankan    -0.16
    ules     -0.16
    esy      -0.16
    arty     -0.16
    esh      -0.15
    arya     -0.15
    xz       -0.15
    Positive Logits
    Token    Logit
    icket    0.22
    ourke    0.21
    angel    0.21
    attr     0.20
    undle    0.20
    oes      0.20
    ober     0.20
    ych      0.19
    ivas     0.19
    aptop    0.19
    Act Density 0.028%
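
    As a rough illustration of what an activation density figure like the one above measures, the sketch below computes the fraction of sampled tokens on which a feature fires. It assumes density is defined as the share of tokens whose activation exceeds a threshold; the function name, the threshold, and the sample values are hypothetical and are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard's exact definition is not stated here.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, with the feature active on 3 of them.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

    A density this low would mean the feature is silent on almost every token, which fits a feature that responds only to a narrow pattern such as names.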

    No Known Activations