INDEX
    Explanations

    proper nouns, particularly names of people, companies, and organizations

    Negative Logits
    Token               Logit
    ")=("               -0.64
    " Titanic"          -0.64
    " DPR"              -0.57
    " silence"          -0.56
    " Underworld"       -0.56
    " needles"          -0.55
    " inconsistency"    -0.55
    " Comet"            -0.54
    " root"             -0.54
    " Rebellion"        -0.54
    Positive Logits
    Token               Logit
    "eworks"             1.03
    "isine"              0.94
    "isons"              0.86
    "anche"              0.86
    "omm"                0.81
    "arel"               0.81
    "inton"              0.79
    "ernels"             0.78
    "aches"              0.77
    "ules"               0.77
    Activation Density: 0.411%

    No Known Activations