INDEX
    Explanations

    reset, clear, angle, defy

    Negative Logits
     bevat    0.54
    يف    0.52
     führen    0.50
     ge    0.48
     Conce    0.47
    中文    0.47
    تع    0.47
     الع    0.46
     internas    0.46
     Re    0.46
    Positive Logits
    neurons    0.58
    soldiers    0.58
    सर्गिक    0.55
    amyl    0.54
    ow    0.53
     seedlings    0.53
     strolled    0.53
    slow    0.52
     vultures    0.52
    sliced    0.52
    Act Density 0.002%

    No Known Activations