INDEX
    Explanations
    Negative Logits
    Token                 Logit
    " offenses"           -0.07
    "_board"              -0.06
    " eth"                -0.06
    "512"                 -0.06
    " mc"                 -0.06
    (token not shown)     -0.06
    " nationalists"       -0.06
    " lids"               -0.06
    " placebo"            -0.06
    (token not shown)     -0.06

    Positive Logits
    Token                 Logit
    " Ventura"            0.06
    "ucción"              0.06
    " ấn"                 0.06
    " vatandaş"           0.06
    "aal"                 0.06
    " olmuş"              0.06
    "ورات"                0.06
    "orses"               0.06
    (token not shown)     0.06
    "ifestyle"            0.06
    Activation Density: 0.035%

    No Known Activations