    Explanations

    negations and phrases indicating something does not occur or is not true

    Negative Logits
    Token          Logit
     increí        -0.90
     Geiſt         -0.87
     dieſe         -0.86
     Geſch         -0.85
     يتيمه         -0.85
     ſeine         -0.83
     müſſen        -0.82
     miniaturka    -0.80
     témoig        -0.80
     zuſammen      -0.79
    Positive Logits
    Token          Logit
    .              0.67
    (blank)        0.66
    (blank)        0.63
    <bos>          0.60
    ↵↵             0.56
     "             0.56
    ,              0.54
     (             0.51
    (whitespace)   0.47
    :              0.47
    Activation Density: 0.241%

    No Known Activations