INDEX
    Explanations

    items that come in pairs or have two different options

    instances of categorization and classification into two groups or types

    Negative Logits
    Token          Logit
    " certs"       -0.83
    "enger"        -0.76
    "board"        -0.70
    " lett"        -0.67
    "phi"          -0.64
    " nowhere"     -0.62
    " rattled"     -0.61
    "endon"        -0.61
    "odo"          -0.60
    "¬¼"           -0.60

    Positive Logits
    Token          Logit
    " namely"      1.06
    " Firstly"     0.95
    " viz"         0.87
    "Firstly"      0.73
    " sexes"       0.70
    "hemat"        0.68
    " ones"        0.64
    " Olson"       0.61
    "avorite"      0.61
    " Wink"        0.61
    Activation Density: 0.234%

    No Known Activations