INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token          Value
    "al"           1.01
    "e"            0.95
    "is"           0.94
    "t"            0.92
    "as"           0.90
    "us"           0.89
    " Clipping"    0.88
    "er"           0.87
    "ת"            0.87
    "ism"          0.86
    Positive Logits
    Token              Value
    " ALSO"            0.79
    (not shown)        0.76
    " uomini"          0.73
    " 여기서"          0.72
    " fromage"         0.72
    "لقة"              0.72
    " graisse"         0.71
    " boissons"        0.70
    " пъ"              0.70
    " spécialistes"    0.70
    Act Density 0.000%

    No Known Activations