INDEX
    Explanations
    NEGATIVE LOGITS
    -0.06  ".Desc"
    -0.06  "Spin"
    -0.06  " цієї"
    -0.06  " newer"
    -0.06  "groupon"
    -0.06  "्न"
    -0.06  "unprocessable"
    -0.06  "	sc"
    -0.06  " dobu"
    -0.06  "•↵↵"
    POSITIVE LOGITS
    0.07  "nutrition"
    0.07  " tyranny"
    0.06  " EventHandler"
    0.06  "hydration"
    0.06  " intrigue"
    0.06
    0.06  "ocard"
    0.06  "ند"
    0.06
    0.06  "ướng"
    Act Density 0.000%

    No Known Activations