INDEX
    Explanations
    Negative Logits
    "خبر"          -0.07
    "essel"        -0.07
    " chaos"       -0.07
    "Chunk"        -0.06
    " chuyển"      -0.06
    "demo"         -0.06
    " WINDOW"      -0.06
    " cylinder"    -0.06
    " kep"         -0.06
    " уров"        -0.06
    Positive Logits
    " faith"       0.19
    " Faith"       0.16
    "faith"        0.10
    "Fa"           0.08
    "ah"           0.08
    "-da"          0.07
    " faithfully"  0.07
    "ith"          0.07
    " Tài"         0.07
    "Tai"          0.07
    Act Density 0.004%

    No Known Activations