    Explanations

    terms related to specific attributes or characteristics

    terms related to specificity in various contexts

    Negative Logits
    rican       -0.76
    Bush        -0.74
    former      -0.72
    =-=-=-=-    -0.72
    ruary       -0.70
    Jenn        -0.69
    kj          -0.68
    NER         -0.68
    http        -0.68
    TPP         -0.68
    Positive Logits
    ities       1.14
    ally        1.04
    iveness     0.97
    izations    0.89
    ivity       0.85
    ality       0.85
    iations     0.84
    itarian     0.83
    ileged      0.83
    ALLY        0.79
    Activation Density: 0.020%

    No Known Activations