INDEX
    Explanations

    closing tags and punctuation

    Negative Logits
    Token              Value
    "ار"               0.62
    "m"                0.54
    "ن"                0.54
    " on"              0.52
    "s"                0.51
    (not rendered)     0.46
    (not rendered)     0.45
    "ri"               0.45
    " of"              0.44
    " as"              0.44
    Positive Logits
    Token              Value
    "?"                0.67
    "ה"                0.47
    "ه"                0.46
    "!"                0.45
    " revital"         0.44
    "</strong>"        0.42
    (not rendered)     0.42
    (not rendered)     0.39
    "</h3>"            0.38
    "ig"               0.38
    Act Density 0.396%

    No Known Activations