    Explanations

    mission statement variants

    Negative Logits
    '    1.09
    د    0.82
    ш    0.81
    с    0.77
    𝙙    0.77
    ким    0.73
    ями    0.73
    (no token shown)    0.73
    ش    0.71
    (no token shown)    0.71

    Positive Logits
     for    1.01
    ம்    0.94
    n    0.87
    m    0.81
    u    0.80
    arın    0.78
    is    0.76
    nél    0.75
    arh    0.74
    ro    0.72
    Activation Density: 0.014%

    No Known Activations