INDEX
    Explanations

    Words related to categorization and classification, particularly in social or systematic contexts.

    Negative Logits
    Token     Logit
    ors       -0.69
    orate     -0.27
    ions      -0.27
    es        -0.26
    or        -0.26
    ion       -0.26
    orial     -0.25
    ed        -0.23
    ori       -0.23
    aar       -0.23
    Positive Logits
    Token     Logit
    tempt     0.28
    rice      0.27
    te        0.25
    he        0.25
    trib      0.24
    rices     0.21
    ivity     0.20
    tributes  0.20
    tempts    0.20
    ricks     0.19
    Activation Density: 0.112%

    No Known Activations