    Negative Logits
    é                   1.05
    ו                   0.96
    í                   0.91
    (no token shown)    0.89
    IB                  0.87
    IO                  0.86
    IE                  0.86
    ή                   0.85
    I                   0.83
    л                   0.83
    Positive Logits
     archetype  0.95
    clientes    0.84
    ت           0.83
    aule        0.81
    كة          0.80
    ri          0.80
    chaft       0.80
    pieza       0.79
    chre        0.79
    一种        0.79
    Act Density 0.004%

    No Known Activations