INDEX
    Explanations
    NEGATIVE LOGITS
    Token             Logit
    " POLITICO"       -0.08
    " podp"           -0.07
    " Solic"          -0.06
    " Properties"     -0.06
    " muschi"         -0.06
    "Registration"    -0.06
    "/student"        -0.06
    " bonne"          -0.06
    " वस"             -0.06
    " masse"          -0.06
    POSITIVE LOGITS
    Token             Logit
    "umbled"          0.08
    "Av"              0.07
    " serving"        0.07
    "Ul"              0.07
    " birds"          0.07
    " mileage"        0.07
    " swiftly"        0.06
    "lac"             0.06
    "Typography"      0.06
    "يط"              0.06
    Activation Density: 0.005%

    No Known Activations