INDEX
    Explanations

    ellipses and discontinuities in text

    Negative Logits
    astify    -0.48
    y         -0.45
    -         -0.42
    bawa      -0.40
     näm      -0.40
    '];       -0.40
     Putih    -0.39
              -0.38
     Sünde    -0.38
     ferons   -0.38
    Positive Logits
    ...),     1.12
    ...)      1.11
    ...).     1.02
    ...?"     1.00
    ...,      0.98
     ...)     0.98
    ...?      0.98
    ...",     0.94
    …)        0.93
    ...".     0.93
    Act Density 0.027%

    No Known Activations