INDEX
    Explanations
    Negative Logits
    _POS              -0.08
    .matches          -0.07
     cont             -0.07
     pursued          -0.07
     Bosch            -0.07
    coll              -0.07
     appealing        -0.07
    announce          -0.06
    (no token shown)  -0.06
    pole              -0.06
    Positive Logits
     типа             0.07
    ocytes            0.06
    (no token shown)  0.06
     тих              0.06
    (no token shown)  0.06
    OD                0.05
    [in               0.05
    activation        0.05
     dẫn              0.05
     Christina        0.05
    Activation Density: 0.011%

    No Known Activations