INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token             Logit
    " so"             -0.80
    " undermines"     -0.79
    " also"           -0.77
    " humiliating"    -0.77
    " apparently"     -0.76
    " couldn"         -0.75
    " thwarted"       -0.74
    " were"           -0.73
    " dumbfounded"    -0.73
    " they"           -0.73
    Positive Logits
    Token             Logit
    "<bos>"           9.69
    " dispen"         1.90
    "GEBURTSDATUM"    1.85
    " fta"            1.77
    " effe"           1.75
    " desir"          1.74
    " squa"           1.74
    " ftu"            1.70
    " nutr"           1.69
    " fto"            1.69
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.