    Explanations

    words indicating composition or structure

    Negative Logits
    Token           Logit
    " consistent"   -0.83
    "#>"            -0.69
    "consistent"    -0.68
    " rekening"     -0.66
    " parem"        -0.64
    " Klagen"       -0.64
    " sobra"        -0.64
    " Consistent"   -0.64
    " QLabel"       -0.62
    " roh"          -0.60
    Positive Logits
    Token           Logit
    " consists"     1.23
    " consisted"    1.17
    " consisting"   1.09
    " consist"      0.94
    " terdiri"      0.94
    "sists"         0.84
    " Hecht"        0.78
    "setopt"        0.78
    " zi"           0.76
    " démocr"       0.75
    Activation Density: 0.085%

    No Known Activations