    NEGATIVE LOGITS
    "Gov"           -0.07
    " detection"    -0.07
    "itation"       -0.07
    " Gov"          -0.07
    " Kits"         -0.07
    " Detection"    -0.06
    "agate"         -0.06
    " Drake"        -0.06
    " floats"       -0.06
    "ни"            -0.06
    POSITIVE LOGITS
    "rophe"         0.07
    "CodeAt"        0.07
    " Prelude"      0.07
    "/perl"         0.07
    " ευ"           0.06
    " creating"     0.06
    ".LEADING"      0.06
    " مف"           0.06
    " (?,"          0.06
    "лені"          0.06
    Activation Density: 0.003%

    No Known Activations