INDEX
    NEGATIVE LOGITS
    нулся               -0.07
                        -0.07
     Ecuador            -0.07
    defs                -0.07
    BAT                 -0.06
    zik                 -0.06
     Uncategorized      -0.06
     ave                -0.06
                        -0.06
    _activation         -0.06
    POSITIVE LOGITS
    Elf                 0.07
     دارای              0.06
    дн                  0.06
    hoa                 0.06
                        0.06
                        0.06
    (Math               0.06
     whether            0.06
     grenade            0.05
     """↵               0.05
    Act Density 0.000%

    No Known Activations