INDEX
    Explanations

    proper nouns, especially names

    Negative Logits
    "20439"       -0.82
    "scenes"      -0.73
    "birds"       -0.69
    " Tactics"    -0.68
    "cells"       -0.68
    "mallow"      -0.67
    "bands"       -0.66
    " sled"       -0.65
    "clad"        -0.62
    " Conduct"    -0.62
    Positive Logits
    "igible"      0.85
    " Edu"        0.83
    "undo"        0.81
    "estine"      0.80
    "aughed"      0.80
    "ardo"        0.78
    "ston"        0.77
    "rative"      0.76
    "ucer"        0.74
    "itialized"   0.73
    Activation Density: 0.029%

    No Known Activations