INDEX
    Explanations

    names, especially ones that are repeated multiple times

    proper nouns, particularly names

    Negative Logits
    Token             Logit
    "ivals"           -0.86
    "atchewan"        -0.81
    "ainment"         -0.75
    "orters"          -0.74
    " [+"             -0.73
    "ãĥķãĤ©"          -0.72
    "urgical"         -0.71
    "ItemTracker"     -0.71
    "urgy"            -0.71
    "ablishment"      -0.68
    Positive Logits
    Token             Logit
    " Dee"            1.41
    "zie"             0.88
    "ples"            0.85
    " Reeves"         0.83
    "leigh"           0.80
    "pling"           0.80
    "pee"             0.76
    " Dodd"           0.72
    "ffe"             0.72
    "ble"             0.72
    Activation Density: 0.007%

    No Known Activations