INDEX
    Explanations
    Negative Logits
    Token                  Logit
    ">B"                   -0.07
    (token text missing)   -0.06
    "Transaction"          -0.06
    "_PATCH"               -0.06
    " characterization"    -0.06
    "_NR"                  -0.06
    "Grammar"              -0.06
    " withdrawing"         -0.06
    " Enterprise"          -0.06
    "_SEQUENCE"            -0.06
    Positive Logits
    Token                  Logit
    " voc"                 0.07
    " nomin"               0.07
    "bc"                   0.07
    " дог"                 0.07
    " incon"               0.07
    (token text missing)   0.06
    " kommun"              0.06
    " ev"                  0.06
    " rte"                 0.06
    " radios"              0.06
    Activation Density: 0.024%

    No Known Activations