INDEX
    Explanations

    phrases related to directional orientations, particularly "left" and its variations

    Negative Logits
    rna        -0.17
    xl         -0.16
    รอง        -0.16
    ously      -0.15
    ouflage    -0.15
    rå         -0.15
     элек      -0.15
    adil       -0.15
    ร          -0.15
    うち       -0.15
    Positive Logits
    ward       0.24
    /right     0.22
    wards      0.21
    most       0.21
    -hand      0.21
    ness       0.20
    ablish     0.18
    tings      0.18
    -wing      0.17
    s          0.16
    Activation Density: 0.052%

    No Known Activations