INDEX
    NEGATIVE LOGITS
     непосредственно    0.43
     atk    0.42
     ayudan    0.42
     allein    0.40
     representando    0.40
     erfol    0.39
     representing    0.38
     নিজে    0.38
     quinquina    0.38
     beak    0.38
    POSITIVE LOGITS
    經歷    0.52
     embraces    0.51
     устро    0.51
     obsessed    0.49
     undergo    0.48
     loves    0.47
     mengalami    0.47
     obses    0.47
    における    0.46
     tolerate    0.46
    Act Density 0.022%

    No Known Activations