INDEX
    Explanations

    Proper nouns such as names of people and organizations, especially those related to politics or business

    Negative Logits
    rier        -0.75
    holder      -0.69
    ivities     -0.69
    selves      -0.68
    rap         -0.66
    ishing      -0.66
    atche       -0.66
    ancies      -0.65
    olk         -0.62
    etheless    -0.62
    Positive Logits
    ppo         1.17
    ffic        0.99
    ctl         0.97
    cean        0.96
    zzi         0.95
    active      0.94
    ÄŁ          0.93
    hazard      0.91
    zza         0.90
    pec         0.88
    Activation Density: 0.033%

    No Known Activations