INDEX
    Explanations

    proper nouns, such as names of places, people, and organizations

    terms related to inquiries and evaluations

    Negative Logits
     disabling      -0.62
     stimulating    -0.62
     Introduced     -0.60
    CLASSIFIED      -0.57
     beginnings     -0.57
    paren           -0.57
    gradient        -0.57
     Allows         -0.56
     Casting        -0.55
    ILA             -0.55
    Positive Logits
     belonged       0.95
     hail           0.89
     belong         0.85
     wore           0.84
     were           0.81
     consisted      0.80
     owe            0.77
     behaved        0.77
     knew           0.75
     include        0.74
    Activation Density: 0.304%

    No Known Activations