    Explanations

    mentions of flaws or mistakes

    Negative Logits
     shortcomings     -0.92
     CURIAM           -0.76
    __":              -0.71
     deficiencies     -0.69
     linkovi          -0.68
    Filmographie      -0.66
     sumpay           -0.64
     bucket           -0.61
     otomatig         -0.61
     OCCUP            -0.60
    Positive Logits
     flaw             2.00
     flaws            1.44
     flawed           1.22
     flawless         1.15
    airo              0.79
     flawlessly       0.79
     Blas             0.72
    Blas              0.65
    dele              0.64
    fla               0.61
    Activation Density: 0.001%

    No Known Activations