    Negative Logits
     butter     -0.08
    Lean        -0.07
     times      -0.07
    amb         -0.07
     latte      -0.07
    _times      -0.07
    outline     -0.07
     "...       -0.07
    _outline    -0.07
     lavish
    Positive Logits
     invari         0.12
     conserved      0.10
     Immutable      0.10
     invariant      0.10
    Invariant       0.09
     hashlib        0.09
     Conservation   0.08
     conservation   0.08
    isul            0.08
     Inventory      0.08
    Activation Density: 0.009%

    No Known Activations