    Negative Logits
    Token                                 Logit
    "_ITEM"                               -0.07
    " THAT"                               -0.07
    "_top"                                -0.07
    " Adopt"                              -0.06
    "________________________________"    -0.06
    "plemented"                           -0.06
    "_Call"                               -0.06
    "border"                              -0.06
    "Development"                         -0.06
    "lymp"                                -0.06
    Positive Logits
    Token                                 Logit
    " среди"                              0.06
    " पक"                                 0.06
    (no visible token)                    0.06
    " obscured"                           0.06
    " >/"                                 0.06
    " příč"                               0.06
    " відпов"                             0.06
    (no visible token)                    0.06
    "ա"                                   0.05
    (no visible token)                    0.05
    Act Density 0.028%

    No Known Activations