INDEX
    Explanations

    recalling details, context, or facts

    Negative Logits
    \  1.27
    ف  0.96
    '  0.93
    ج  0.91
    ive  0.79
     amplio  0.79
     anisotrop  0.78
    ?  0.77
    à  0.76
    ä  0.76
    Positive Logits
    d  1.10
    ר  1.05
    ur  1.01
    h  1.01
    ת  1.01
    м  1.00
    z  0.97
    ర్  0.96
    AK  0.96
    0.96
    Activation Density: 0.062%

    No Known Activations