INDEX
    Explanations


    Negative Logits
    Token              Value
    "t"                1.13
    "ing"              1.12
    "L"                0.93
    " at"              0.93
    "ت"                0.91
    (token missing)    0.90
    " -"               0.87
    "ر"                0.87
    "ty"               0.85
    "d"                0.81
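    Positive Logits
    Token              Value
    "،"                1.19
    (token missing)    1.01
    (token missing)    0.99
    (token missing)    0.96
    (token missing)    0.90
    " propuestas"      0.86
    (token missing)    0.82
    (token missing)    0.79
    (token missing)    0.78
    "ור"               0.77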
    Activation Density: 0.036%

    No Known Activations