    Explanations

    potential negative impacts

    The neuron flags terms that convey risk, threat or potential negative outcomes.

    Negative Logits
    Token             Logit
    " Based"          -0.07
    "lpVtbl"          -0.07
    " Fil"            -0.07
    (not rendered)    -0.06
    " convers"        -0.06
    "<Article"        -0.06
    " Saudis"         -0.06
    "crow"            -0.06
    " fairy"          -0.06
    "fld"             -0.06
    Positive Logits
    Token             Logit
    "-haired"         0.08
    ")=="             0.08
    "ONO"             0.07
    (not rendered)    0.07
    "μένη"            0.07
    " demonic"        0.06
    " Semaphore"      0.06
    (not rendered)    0.06
    " triển"          0.06
    " ساخته"          0.06
    Activation Density: 0.047%

    No Known Activations