INDEX
    Explanations
    Negative Logits
    Value  Token
    0.68   " "
    0.47   " ١"
    0.37   "↵↵"
    0.35   " enfoque"
    0.35   " aquest"
    0.35   " acest"
    0.34   "="
    0.34   " folgender"
    0.34   " importanti"
    0.34   "8"
    Positive Logits
    Value  Token
    0.41   "ле"
    0.40   "to"
    0.39   " demás"
    0.39   "ى"
    0.39
    0.38   "b"
    0.38   "л"
    0.37   "νο"
    0.37   "ικ"
    0.37   " থাকিতে"
    Activation Density: 2.275%

    No Known Activations