INDEX
    Explanations

    phrases that include the word "sort" in reference to categorization or comparison

    phrases that express classification or categorization

    Negative Logits
    Token          Logit
    "interrupted"  -0.63
    " Madison"     -0.62
    "PLIC"         -0.61
    "PER"          -0.60
    "VICE"         -0.60
    " Dent"        -0.59
    "PLE"          -0.58
    "INT"          -0.57
    "NZ"           -0.56
    "VIS"          -0.56
    Positive Logits
    Token          Logit
    "ilege"        0.88
    "a"            0.84
    "ies"          0.83
    "ie"           0.82
    "ative"        0.81
    "liness"       0.81
    "olith"        0.79
    "iple"         0.77
    "able"         0.77
    "entially"     0.76
    Activation Density: 0.033%

    No Known Activations