INDEX
    Explanations

    first names followed by last names

    Negative Logits
    " inverted"       0.63
    " skewed"         0.60
    " "               0.57
    " subtle"         0.57
    " item"           0.57
    " proverbial"     0.57
    " zero"           0.56
    " alpha"          0.56
    " output"         0.55
    " ac"             0.55

    Positive Logits
    "<unused118>"     0.87
    "chyné"           0.83
    " bergabung"      0.83
    " великолеп"      0.83
    "<unused1005>"    0.82
    " וא"             0.82
    " compañía"       0.81
    "쮿"              0.80
    "<unused1091>"    0.80
    " melaksanakan"   0.80
    Activation Density 0.060%

    No Known Activations