    Negative Logits
    byn          -0.08
    ców          -0.08
    μών          -0.08
    GIN          -0.08
     marketer    -0.07
                 -0.07
    τει          -0.07
     jire        -0.07
    جات          -0.07
     oss         -0.07
    Positive Logits
     Assume         0.08
     Additionally   0.08
     categorie      0.08
    ."""↵↵          0.08
     Setup          0.08
     """↵↵          0.08
     نفر            0.08
     खाली           0.08
     Initialize     0.08
     outline        0.07
    Activation Density: 0.057%

    No Known Activations