INDEX
    Explanations
    NEGATIVE LOGITS
    '           0.29
    =           0.28
    ^           0.27
     fatores    0.27
    Genres      0.27
    ،           0.25
    Factors     0.25
    );          0.25
    Manus       0.25
    Question    0.25
    POSITIVE LOGITS
    in          0.43
    t           0.41
    ar          0.40
    at          0.38
    n           0.36
    ر           0.35
    er          0.35
    ado         0.34
    h           0.34
    r           0.33
    Activation Density: 0.678%

    No Known Activations