    Negative Logits
    Token              Logit
    " رض"              -0.08
    " novels"          -0.06
    "もの"             -0.06
    " subj"            -0.06
    " INTERNATIONAL"   -0.06
                       -0.06
    " friendly"        -0.06
    " approximation"   -0.06
    " Casual"          -0.06
    " Nice"            -0.06
    Positive Logits
    Token              Logit
    " taste"           0.07
    "-ли"              0.07
    " NodeType"        0.07
    " itemType"        0.07
    " setCurrent"      0.07
    "(sort"            0.06
    " setEmail"        0.06
    " payout"          0.06
    "toContain"        0.06
    "(TAG"             0.06
    Activation Density: 0.004%

    No Known Activations