INDEX
    Explanations
    Negative Logits
    Token               Logit
    " suicidal"         -0.06
    " mitigation"       -0.06
    " Muham"            -0.06
    "ionage"            -0.06
    " nu"               -0.06
    " 당신"             -0.06
    " yani"             -0.06
    " salsa"            -0.06
    " logos"            -0.06
    "/Base"             -0.06

    Positive Logits
    Token               Logit
    " ايران"            0.07
    "/*!↵"              0.06
    "VisualStyle"       0.06
    "lament"            0.06
    " zf"               0.06
    "LOSE"              0.06
    " Luxury"           0.06
    "_classification"   0.06
    ".heading"          0.06
    " Boys"             0.06

    Activation Density: 0.045%

    No Known Activations