    Explanations

    terminal punctuation marks

    Negative Logits
    (token, value; tokens are quoted so that leading spaces stay visible)
    "s"            0.40
    "ים"           0.37
    "ו"            0.35
    "اٹ"           0.35
    " operations"  0.34
    "್ಯಾ"          0.34
    " נישט"        0.33
    "t"            0.33
    " motors"      0.33
    "ાલુ"          0.33
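    Positive Logits
    (token, value; rows marked "(token not shown)" have a value but no visible token in the source)
    (token not shown)  0.64
    (token not shown)  0.46
    (token not shown)  0.43
    "."                0.43
    ".,"               0.41
    ")."               0.40
    ".;"               0.39
    (token not shown)  0.39
    ".)"               0.38
    "؛"                0.37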
    Act Density 0.133%

    No Known Activations