INDEX
    Explanations

    entities related to health and medical conditions

    instances of proper nouns, particularly names and specific terms

    Negative Logits
    Token          Logit
    "ed"           -0.70
    " silenced"    -0.68
    "e"            -0.65
    "staking"      -0.65
    "dar"          -0.65
    " condensed"   -0.62
    "LY"           -0.60
    "ĪĴ"           -0.59
    "Madison"      -0.59
    " Bundy"       -0.59
    Positive Logits
    Token          Logit
    "atform"       1.18
    "opl"          1.13
    "asts"         1.10
    "asso"         1.04
    "asms"         1.01
    "ifting"       1.00
    "thora"        0.95
    "ases"         0.95
    "ifts"         0.95
    "uten"         0.92
    Activation Density: 0.015%

    No Known Activations