INDEX
    Explanations
    Negative Logits
    " Dash"             -0.08
    " DASH"             -0.08
    "Dash"              -0.07
    " Blatt"            -0.07
    " SYMBOL"           -0.07
    " Silent"           -0.07
    " utter"            -0.07
    "/reference"        -0.07
    " Ruhe"             -0.07
    "dash"              -0.07
    Positive Logits
    " increasingly"     0.09
    " sağlar"           0.08
    " developing"       0.08
    " progresses"       0.08
    " inim"             0.08
    "越来越"            0.08
    "creasing"          0.08
                        0.08
    " progressively"    0.08
    " progressing"      0.08
    Act Density 0.025%

    No Known Activations