    Negative Logits
        Token                       Value
        " tackle"                   0.48
        " حمله"                     0.48
        " Aprili"                   0.45
        (token missing in source)   0.45
        " arah"                     0.44
        " soluzioni"                0.43
        " ها"                       0.43
        " delicacy"                 0.43
        " Hail"                     0.42
        "ربي"                       0.41
    Positive Logits
        Token                       Value
        "SELECT"                    0.80
        "Selective"                 0.71
        " selects"                  0.69
        "deselect"                  0.69
        "Select"                    0.68
        " Select"                   0.68
        "selecting"                 0.66
        " SELECT"                   0.66
        " selective"                0.64
        " select"                   0.63
    Activation Density: 0.014%

    No Known Activations