    Negative Logits
    Token                Logit
    " in"                1.16
    " the"               1.05
    " في"                0.77
    "is"                 0.76
    (blank in source)    0.73
    ";"                  0.71
    " auteur"            0.70
    " ABV"               0.69
    " is"                0.68
    " attire"            0.66
    Positive Logits
    Token                Logit
    (blank in source)    1.14
    "ون"                 0.93
    "いを"               0.92
    "t"                  0.88
    (blank in source)    0.87
    "на"                 0.87
    (blank in source)    0.86
    (blank in source)    0.86
    "ла"                 0.84
    "いい"               0.83
    Activation Density: 0.536%

    No Known Activations