    Explanations

    references to physical attractiveness, specifically the term "handsome."

    Negative Logits
    aval      -0.16
    zer       -0.15
    upa       -0.15
    čka       -0.15
    ivate     -0.15
    715       -0.14
    arsers    -0.14
    ilig      -0.14
    udeau     -0.14
    887       -0.14
    Positive Logits
    riott     0.15
    irt       0.15
    形        0.14
     Bylo     0.14
     Sachs    0.14
    esson     0.14
    sap       0.14
    anned     0.14
    vell      0.14
    mere      0.14
    Act Density 0.001%

    No Known Activations