INDEX
    Explanations
    Negative Logits
    Token                 Logit
    " incom"              -0.07
    " fabrication"        -0.07
    " checklist"          -0.06
    " translation"        -0.06
    " Publishers"         -0.06
    ".bus"                -0.06
    " ZeroConstructor"    -0.06
    " QFile"              -0.06
    " Throughout"         -0.06
    "795"                 -0.06
    Positive Logits
    Token                 Logit
    "_cap"                0.06
    "encv"                0.06
    "(typeof"             0.06
                          0.06
    "versations"          0.06
    " golf"               0.06
    ".integer"            0.06
    "amız"                0.06
    "_WRONG"              0.06
    "Aligned"             0.06
    Activation Density: 0.120%

    No Known Activations