INDEX
    Explanations
    NEGATIVE LOGITS
    _FC              -0.07
     spectral        -0.06
    plode            -0.06
    	Use          -0.06
     Catalyst        -0.06
     tôn             -0.06
    (['/             -0.06
     Sutton          -0.06
    ालन              -0.06
    }*               -0.06
    POSITIVE LOGITS
     leftover        0.14
     leftovers       0.10
    ő                0.06
     venture         0.06
     refund          0.06
    ijn              0.06
     supplementary   0.06
    -total           0.06
    ้าม              0.06
    olynomial        0.06
    Activation Density: 0.002%

    No Known Activations