INDEX
    Explanations

    programming identifiers

    Negative Logits
    "いて"    1.05
    "1"    1.02
    "ku"    0.98
    " at"    0.94
    "مر"    0.93
    "いる"    0.92
    "ED"    0.91
    " that"    0.91
    (no visible token)    0.88
    "rat"    0.88
    Positive Logits
    "ו"    1.31
    "ни"    1.27
    "n"    1.16
    (no visible token)    1.16
    "ти"    1.13
    "و"    1.12
    "ي"    1.11
    (no visible token)    1.06
    (no visible token)    1.01
    (no visible token)    1.01
    Activation Density: 0.342%

    No Known Activations