INDEX
    Explanations

    phrases that involve a division or categorization into two groups or types

    structures that categorize information or concepts

    Negative Logits
    board     -0.69
    atown     -0.68
    idth      -0.67
    ritic     -0.66
              -0.66
    zman      -0.64
    Ł         -0.64
    enger     -0.63
    eez       -0.62
    nell      -0.61
    Positive Logits
     halves    1.01
     Firstly   0.89
     viz       0.84
     namely    0.80
     sexes     0.80
    hemat      0.79
    Firstly    0.76
     sides     0.73
    \'         0.72
     thirds    0.70
    Act Density 0.231%

    No Known Activations