    Negative Logits
    Token          Logit
    " enfrenta"    -0.08
                   -0.08
    " Wij"         -0.08
    "Zw"           -0.08
    "hver"         -0.08
    " tackles"     -0.08
    " связи"       -0.08
    "uelve"        -0.08
    "Wij"          -0.08
    " diza"        -0.08
    Positive Logits
    Token                  Logit
    " misunderstanding"    0.10
    " misunderstood"       0.09
    " unknow"              0.09
    " subconscious"        0.09
    " misleading"          0.09
    " deceptive"           0.09
    " mistaken"            0.09
    " concealed"           0.09
    " unnoticed"           0.09
    " missed"              0.09
    Activation Density: 0.097%

    No Known Activations