    Negative Logits
    "is"        1.02
    "ك"         0.93
    "ین"        0.91
    "л"         0.80
    "k"         0.80
    "غ"         0.78
    "าว"        0.78
    "ंग"        0.77
    "のために"  0.76
    "ズル"      0.76
    Positive Logits
    " pork"         1.02
    " porcine"      0.87
                    0.85
    "I"             0.84
    " Pork"         0.83
    "ри"            0.80
    " boar"         0.79
    " piglets"      0.79
    " ferramenta"   0.78
    "V"             0.77
    Activation Density 0.012%

    No Known Activations