    Negative Logits
    " Φ"  0.38
    " Helix"  0.36
    " shelling"  0.36
    " sondern"  0.36
    " بدون"  0.36
    " পী"  0.35
    " Tart"  0.35
    " Λ"  0.35
    " stallion"  0.35
    " Chopin"  0.35
    Positive Logits
    " how"  1.36
    " what"  1.14
    " cómo"  1.08
    "how"  1.05
    " bagaimana"  1.05
    " hvordan"  1.04
    " why"  1.01
    "what"  0.99
    " কিভাবে"  0.97
    "如何"  0.96
    Activation Density: 0.270%

    No Known Activations