INDEX
    Explanations

    word beginnings (T, S, V, H, C, E, S, S, S)

    Negative Logits
    Token       Value
    "ש"         0.84
    "et"        0.65
    "ي"         0.60
    "en"        0.58
    "u"         0.58
    "ס"         0.57
    "a"         0.56
    " be"       0.55
    " Psalm"    0.55
    "el"        0.55
    Positive Logits
    Token           Value
    (blank)         0.55
    "ுள்ளனர்"        0.54
    " ومع"          0.53
    (blank)         0.53
    "َل"            0.51
    "pubescens"     0.50
    "orthogonal"    0.48
    "attering"      0.48
    (blank)         0.48
    "ING"           0.48
    Act Density 0.270%

    No Known Activations