    Negative Logits
     zwölf  0.75
     două  0.74
    etop  0.74
    (blank)  0.73
     éstas  0.72
     oček  0.71
     séptimo  0.71
    źni  0.71
    🔟  0.70
     ۷  0.70
    Positive Logits
    2  1.30
     and  1.07
    (blank)  0.95
    3  0.91
    מ  0.89
    ال  0.88
    л  0.85
     of  0.84
     or  0.81
    ב  0.81
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.