INDEX
    Explanations
    Negative Logits
    (token not shown)    0.82
    " fidel"             0.80
    "kit"                0.73
    " ندارد"             0.73
    "kilometer"          0.73
    "นต์"                 0.73
    (token not shown)    0.72
    "extended"           0.72
    "ifend"              0.72
    "💮"                 0.71
    POSITIVE LOGITS
    " Sexuality"         0.96
    "ເວລາ"               0.90
    " Queen"             0.87
    " ktorý"             0.87
    "ោធ"                 0.86
    " metastability"     0.85
    " Empowerment"       0.85
    (token not shown)    0.84
    " metastable"        0.84
    " 中心"              0.84
    Activation Density: 0.012%

    No Known Activations