INDEX
    NEGATIVE LOGITS
    Token                   Logit
    "forName"               -0.96
    " overall"              -0.94
    "čnosti"                -0.94
    " for"                  -0.93
    "ocks"                  -0.90
    (token text missing)    -0.88
    "itig"                  -0.88
    "tehen"                 -0.88
    " сторон"               -0.88
    "ências"                -0.87
    POSITIVE LOGITS
    Token           Logit
    " lens"         1.73
    " channels"     1.72
    " medium"       1.72
    " посред"       1.70
    " mediums"      1.67
    " mechanism"    1.52
    " auspices"     1.48
    " biais"        1.46
    " prism"        1.38
    " means"        1.36
    Activation Density: 0.075%

    No Known Activations