    Negative Logits
    Logit   Token
    -0.07   (blank in source)
    -0.07   "_tag"
    -0.07   "Republicans"
    -0.06   " Dynamic"
    -0.06   "-proof"
    -0.06   " Broadcast"
    -0.06   "ogene"
    -0.06   "Proj"
    -0.06   " même"
    -0.06   " conservative"

    Positive Logits
    Logit   Token
    0.08    " bere"
    0.07    " promptly"
    0.07    (blank in source)
    0.07    (blank in source)
    0.07    " mich"
    0.06    "_DIS"
    0.06    " advert"
    0.06    "//------------------------------------------------------------------------------------------------"
    0.06    " Kew"
    0.06    (blank in source)
    Activation Density: 0.014%

    No Known Activations