INDEX
    Explanations

    proper nouns, specifically names of individuals

    Negative Logits
    ToProps     -0.17
    enne        -0.16
    stress      -0.14
     Atkins     -0.14
    antic       -0.14
    armac       -0.14
    ToWorld     -0.13
    uddle       -0.13
    apons       -0.13
    vap         -0.13
    POSITIVE LOGITS
    FLICT       0.16
     Chronic    0.14
     Fans       0.14
    osi         0.14
    λιά         0.13
    968         0.13
    ')['        0.13
     Fil        0.13
    UBLE        0.12
     Ner        0.12
    Act Density 0.071%

    No Known Activations