INDEX
    Explanations
    Negative Logits
     familiarity    -0.06
     Instit         -0.06
     acclaim        -0.06
     Records        -0.06
     drunk          -0.06
     bump           -0.06
    multiline       -0.06
    +xml            -0.06
     flipped        -0.06
     emulator       -0.06
    Positive Logits
     idle           0.07
    ampled          0.07
     IDR            0.07
     Shane          0.07
     humane         0.07
    ITED            0.06
     TableView      0.06
     Giuliani       0.06
    _cou            0.06
    ored            0.06
    Act Density 0.000%

    No Known Activations