    Negative Logits
    " infatti"          -0.09
    " unfortunately"    -0.09
    " infelizmente"     -0.09
    " например"         -0.08
    " leider"           -0.08
    " начиная"          -0.08
    "372"               -0.08
    " totiž"            -0.08
    " betekenen"        -0.08
    " Algem"            -0.07
    Positive Logits
    " persever"         0.09
    "тік"               0.09
                        0.08
    " perseverance"     0.08
    "-around"           0.08
    " بتوان"            0.08
    "_once"             0.08
    " podido"           0.08
    " قادر"             0.08
    "atea"              0.08
    Activation Density: 0.031%

    No Known Activations