INDEX
    Explanations

    proper nouns, specifically names of individuals

    Negative Logits
    Token         Logit
    "ouz"         -0.15
    "oze"         -0.14
    "urm"         -0.14
    "odal"        -0.14
    "_Flag"       -0.14
    "idl"         -0.13
    "lrt"         -0.13
    "ót"          -0.13
    "ода"         -0.13
    "renched"     -0.13
    Positive Logits
    Token         Logit
    "greg"        0.16
    "eniable"     0.16
    " Greg"       0.15
    " Ernest"     0.14
    "Greg"        0.14
    "gili"        0.14
    " Orig"       0.13
    "onders"      0.13
    "cul"         0.13
    " Gregory"    0.13
    Activation Density: 0.102%

    No Known Activations