    Explanations

    articles/prepositions

    Negative Logits
     Undergraduate  -0.08
     beschikken     -0.08
     starters       -0.08
     undergraduate  -0.08
     văn            -0.08
    morgen          -0.07
     amps           -0.07
    kund            -0.07
    -conscious      -0.07
     scher          -0.07
    Positive Logits
     ZERO           0.10
     zeros          0.09
    .ZERO           0.09
    angler          0.09
     zéro           0.08
     repeated       0.08
     Zero           0.08
     zero           0.08
    _ZERO           0.08
     lowercase      0.08
    Activation Density 0.006%

    No Known Activations