INDEX
    Explanations

    comments and documentation in code

    Negative Logits
    ftagPool          -0.74
    MLLoader          -0.73
     queſta           -0.70
    ConstraintMaker   -0.65
    tagPool           -0.64
     autorytatywna    -0.63
    awtextra          -0.61
     wikipagina       -0.61
     sumpay           -0.60
     avoient          -0.60
    Positive Logits
    ":"",       0.42
    permitAll   0.39
     Fior       0.34
     Sheehan    0.34
     {}         0.34
    fmt         0.33
    Laughs      0.32
     Craig      0.32
     basta      0.32
     reason     0.32
    Activation Density: 0.401%

    No Known Activations