INDEX
    Explanations

    repeated mentions of "us."

    Negative Logits
    Token             Logit
    "NewUrlParser"    -0.47
    " tille"          -0.42
    "joba"            -0.42
    " Collo"          -0.41
    "proy"            -0.40
    "oyi"             -0.40
    " Reine"          -0.38
    "etan"            -0.38
    "ation"           -0.36
    " chain"          -0.36
    Positive Logits
    Token             Logit
    " us"             1.75
    " Us"             1.30
    "Us"              1.16
    " meille"         1.09
    " нам"            0.91
    "讓我們"          0.91
    "us"              0.90
    " нас"            0.90
    " ours"           0.90
    " nás"            0.90
    Activation Density: 0.059%

    No Known Activations