    Negative Logits
    Token             Logit
    " Clarkson"       -0.71
    "forth"           -0.71
    " Angus"          -0.71
    " Jagu"           -0.69
    " Borders"        -0.66
    " Emerson"        -0.65
    " appointments"   -0.64
    " Eag"            -0.63
    " Allied"         -0.62
    " quotas"         -0.62
    Positive Logits
    Token             Logit
    "sexual"          0.81
    "ria"             0.78
    "cess"            0.78
    "lder"            0.78
    "vec"             0.78
    " guest"          0.77
    " ]["             0.76
    "ird"             0.75
    "lex"             0.75
    "pert"            0.68
    Activation Density: 0.059%

    No Known Activations