INDEX
    NEGATIVE LOGITS
    Token        Logit
    "renheit"    -0.75
    "nell"       -0.70
    "ilit"       -0.69
    "xus"        -0.69
    "biz"        -0.69
    "vous"       -0.69
    "plete"      -0.68
    "vich"       -0.67
    "scrib"      -0.66
    "uably"      -0.66
    POSITIVE LOGITS
    Token          Logit
    " sexes"       1.68
    " sides"       1.44
    " halves"      1.40
    " genders"     1.37
    " parties"     0.96
    " extremes"    0.92
    " coasts"      0.89
    " Houses"      0.88
    " ends"        0.86
    " thirds"      0.82
    Activation Density: 2.436%

    No Known Activations