INDEX
    Explanations

    letters to the editor

    Negative Logits
     MOT           -0.08
     intern        -0.07
     Mot           -0.07
     \             -0.07
    ł              -0.07
     wore          -0.07
     \(            -0.07
     perfumes      -0.07
     cope          -0.07
    !\             -0.06
    Positive Logits
     rebut         0.09
     можем         0.09
     citoyens      0.09
     građ          0.08
     volgen        0.08
     Pitts         0.08
     Fence         0.08
     aanbevel      0.08
     интересно     0.08
     добров        0.08
    Activation Density: 0.015%

    No Known Activations