    Explanations

    instances of the word "nothing."

    Negative Logits
    Token       Logit
    " alles"    -0.15
    "enu"       -0.14
    "438"       -0.14
    " aren"     -0.14
    "ØŃØ©"      -0.14
    "üven"      -0.14
    "ial"       -0.13
    "serter"    -0.13
    "posit"     -0.13
    "917"       -0.13
    Positive Logits
    Token       Logit
    " else"     0.35
    "ness"      0.28
    "else"      0.25
    " ELSE"     0.23
    "Else"      0.22
    "burger"    0.21
    "_else"     0.21
    " Else"     0.21
    " wrong"    0.21
    "/no"       0.20
    Act Density 0.048%

    No Known Activations