    NEGATIVE LOGITS
    ristol      -0.06
    Wal         -0.06
    Unified     -0.06
    urgery      -0.06
     guarding   -0.06
     Bac        -0.06
    ellig       -0.06
     tariff     -0.05
    دانلود      -0.05
     Goods      -0.05
    POSITIVE LOGITS
     ECB        0.07
    "};↵↵       0.07
     напис      0.07
     triple     0.07
    CHANT       0.06
    .spec       0.06
    ="'.        0.06
    δες         0.06
    HomeAsUp    0.06
    .parser     0.06
    Act Density 0.007%

    No Known Activations