INDEX
    Explanations
    No Explanations Found
    Negative Logits
    orno         -0.74
    elsen        -0.72
    artney       -0.71
    istries      -0.71
    ousand       -0.69
    irez         -0.65
     cleanup     -0.64
     volunteer   -0.64
     amnesty     -0.63
    asions       -0.61
    Positive Logits
    scar         0.79
    -+-+         0.73
    bid          0.69
     Sph         0.68
    morrow       0.66
    scrib        0.65
     Oswald      0.65
    vest         0.65
     condem      0.65
     Fraz        0.64
    Activation Density 0.000%

    No Known Activations

    This feature has no known activations.