    Negative Logits
    "anced"          -0.07
    " communism"     -0.06
    "construction"   -0.06
    " конечно"       -0.06
    [token missing]  -0.06
    [token missing]  -0.06
    [token missing]  -0.06
    " Pon"           -0.06
    " Кор"           -0.05
    "notin"          -0.05
    Positive Logits
    "-close"         0.07
    "association"    0.06
    [token missing]  0.06
    " isr"           0.06
    "uba"            0.06
    "UBL"            0.06
    "_Source"        0.06
    "alt"            0.06
    ">p"             0.06
    " glue"          0.06
    Activation Density: 0.012%

    No Known Activations