    Explanations

    The neuron fires on words that signal physical harm, danger, or injury.

    Negative Logits
    Token            Logit
    " desarrollo"    -0.06
    " نگ"            -0.06
    " الشم"          -0.06
    "يد"             -0.06
    ":path"          -0.06
    "ustainable"     -0.06
    "iman"           -0.06
    " convertible"   -0.06
    " Роз"           -0.05
    "uyễn"           -0.05

    Positive Logits
    Token            Logit
    "BTTagCompound"  0.07
    " graceful"      0.07
    ".functions"     0.07
    "erus"           0.06
    " phenotype"     0.06
    ".*("            0.06
    "(batch"         0.06
    "-big"           0.06
    ".':"            0.06
    "_LINK"          0.06
    Activation Density 0.022%

    No Known Activations