    Explanations

    Russian negation particle "Не"

    Negative Logits
    Token          Logit
    "𖤍"            -2.56
    " anderen"     -2.44
    ".”"           -2.42
    (not shown)    -2.42
    " arbeta"      -2.38
    (not shown)    -2.34
    (not shown)    -2.34
    (not shown)    -2.31
    (not shown)    -2.30
    "of"           -2.28

    Positive Logits
    Token          Logit
    "ization"      2.39
    " They"        2.38
    " Не"          2.28
    (not shown)    2.28
    " mereka"      2.27
    (not shown)    2.25
    "身体"          2.22
    "It"           2.19
    " Ві"          2.17
    "There"        2.16

    Activation Density: 0.002%

    No Known Activations