INDEX
    NEGATIVE LOGITS
    Token           Logit
    " Leistung"     -0.08
    (blank)         -0.08
    "rer"           -0.08
    " obrig"        -0.07
    " расчет"       -0.07
    " ביום"         -0.07
    "utsa"          -0.07
    " Vargas"       -0.07
    " Einzahlung"   -0.07
    " (__"          -0.07
    POSITIVE LOGITS
    Token           Logit
    " scholars"     0.08
    " jedn"         0.07
    " barbar"       0.07
    " pilgrims"     0.07
    " offerings"    0.07
    " labour"       0.07
    " ẹya"          0.07
    (blank)         0.07
    "brates"        0.07
    " flore"        0.07
    Activation Density: 0.001%

    No Known Activations