INDEX
    NEGATIVE LOGITS
    Token             Logit
    "*/]."            -0.43
    "Princip"         -0.42
    "itects"          -0.42
    " tjen"           -0.41
    " Principals"     -0.41
    " Cactus"         -0.41
    "Eins"            -0.40
    " MwSt"           -0.39
    "Tanz"            -0.38
    " fibroblast"     -0.38
    POSITIVE LOGITS
    Token             Logit
    " over"           1.15
    " Over"           1.09
    "over"            1.09
    " OVER"           1.04
    "Over"            0.98
    "OVER"            0.89
    "kover"           0.85
    " över"           0.79
    " über"           0.78
    "tover"           0.72
    Activation Density: 0.057%

    No Known Activations