    Explanations

    Ara h, Vicuna, Man Hemlock, HER2, Sbarro, H

    Negative Logits
    "ل"      0.91
    "ون"     0.88
    "ج"      0.85
    "માં"     0.79
    "ق"      0.77
    "ע"      0.75
    "ку"     0.75
    "ف"      0.74
    "ח"      0.74
    "in"     0.73
    Positive Logits
    " "          0.79
    " to"        0.60
    (missing)    0.51
    " was"       0.49
    " piccoli"   0.48
    " bessere"   0.48
    "$,"         0.47
    " Çok"       0.47
    " Но"        0.46
    " był"       0.46
    Act Density 0.268%

    No Known Activations