INDEX
    Explanations
    NEGATIVE LOGITS
    liest       -0.75
    idious      -0.75
    glers       -0.73
    keye        -0.70
    erion       -0.69
    abouts      -0.68
    emaker      -0.67
    roads       -0.66
    mates       -0.66
    gest        -0.66
    POSITIVE LOGITS
     Hir         0.90
     Abdullah    0.85
     Singh       0.85
     Ali         0.84
     Hos         0.83
     Kard        0.83
     Quin        0.82
     Har         0.81
     Perez       0.81
     Henderson   0.79
    Activation Density 0.178%

    No Known Activations