INDEX
    Explanations

    entities related to specific people like "Monroe" and "Lyon"

    proper nouns, particularly names of people and places

    Negative Logits
    Token         Logit
    "arded"       -0.78
    "unks"        -0.74
    "arding"      -0.74
    "abor"        -0.73
    "reddits"     -0.72
    "rets"        -0.70
    "gravity"     -0.70
    "rogen"       -0.70
    "uden"        -0.69
    "abytes"      -0.68
    Positive Logits
    Token         Logit
    "street"      0.91
    "alties"      0.91
    " Monroe"     0.87
    " Lyon"       0.82
    "court"       0.80
    "selves"      0.79
    "hurst"       0.76
    " Superior"   0.75
    "ville"       0.71
    "ette"        0.70
    Activation Density: 0.038%

    No Known Activations