INDEX
    Explanations

    Constructivism

    Negative Logits
    Token              Logit
    (token missing)    -0.08
    "ugh"              -0.08
    " occhi"           -0.08
    " tenor"           -0.07
    (token missing)    -0.07
    " Cad"             -0.07
    "Tem"              -0.07
    " abusive"         -0.07
    "ikh"              -0.07
    "lady"             -0.07
    Positive Logits
    Token              Logit
    " concur"          0.08
    " propagated"      0.08
    " totes"           0.08
    " propagation"     0.08
    (token missing)    0.08
    " woont"           0.08
    " welcomed"        0.08
    "алин"             0.07
    "վեն"              0.07
    " 아니"             0.07
    Activation Density: 0.003%

    No Known Activations