INDEX
    Explanations

    mentioning documentation or projects

    NEGATIVE LOGITS
    Token            Logit
    " простой"       0.45    (Russian: "simple")
    "単純"           0.45    (Japanese: "simple")
    "Upon"           0.40
    "Simple"         0.39
    " simple"        0.38
    " severely"      0.37
    "uson"           0.36
    " acquire"       0.36
    "simple"         0.36
    " undetected"    0.36
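    POSITIVE LOGITS
    Token            Logit
    " mention"       1.91
    " 언급"          1.80    (Korean: "mention")
    " mentions"      1.73
    " mentioning"    1.67
    " Mention"       1.64
    "Mention"        1.63
    "mention"        1.62
    " упомина"       1.62    (Russian: stem of "to mention")
    "提及"           1.48    (Chinese: "mention")
    " mencionar"     1.45    (Spanish/Portuguese: "to mention")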
    Activation Density: 0.049%

    No Known Activations