INDEX
    Explanations

    code function arguments

    Negative Logits
    " kabar"        0.42
    " irgende"      0.39
    " ständig"      0.38
                    0.37
    " prothorax"    0.36
                    0.36
    " Sainsbury"    0.36
                    0.36
    " sentir"       0.36
                    0.36
    Positive Logits
    "and"          0.42
    "with"         0.39
    "ate"          0.37
    " for"         0.35
    " with"        0.34
    "之后"         0.34
    " metadata"    0.33
    " decrease"    0.33
    "_"            0.33
    "ond"          0.33
    Act Density 0.024%

    No Known Activations