INDEX
    Explanations

    proper nouns, particularly people's names

    names of individuals or characters

    Negative Logits
    Token            Logit
    "ually"          -0.74
    "#$"             -0.72
    "forced"         -0.64
    "lihood"         -0.63
    "acity"          -0.62
    "Pokémon"        -0.62
    "rophe"          -0.62
    " Cyborg"        -0.59
    " Creator"       -0.58
    " corridors"     -0.58
    Positive Logits
    Token            Logit
    "mie"            0.90
    "iners"          0.90
    " Seym"          0.84
    "sie"            0.83
    "inery"          0.82
    "zbollah"        0.80
    "sat"            0.79
    "alf"            0.79
    "ginx"           0.79
    "ineries"        0.78
    Activation Density: 0.042%

    No Known Activations