    Negative Logits
    Token            Logit
    " on"            1.15
    " is"            1.09
    "ية"             0.89
    " was"           0.89
    "ti"             0.88
    "</h2>"          0.87
    " الغ"           0.86
    "leştir"         0.86
    " المُ"          0.84
    [token missing]  0.84
    Positive Logits
    Token            Logit
    [token missing]  1.26
    "0"              1.23
    "ו"              1.19
    "it"             1.16
    "ן"              1.16
    "the"            1.15
    "ם"              1.14
    "Identify"       1.13
    [token missing]  1.11
    "ro"             1.08
    Act Density 0.039%

    No Known Activations