INDEX
    Explanations
    NEGATIVE LOGITS
     Vitamin        -0.06
     عرضه           -0.06
     terminates     -0.06
     superhero      -0.06
    avenport        -0.06
     terminating    -0.06
     Da             -0.06
    imit            -0.06
    COMMAND         -0.06
     CAT            -0.06
    POSITIVE LOGITS
     slow           0.10
     slower         0.10
     Slow           0.08
     따른             0.07
     slows          0.07
    .subtitle       0.07
     impover        0.07
     slowing        0.07
     silence        0.07
    940             0.07
    Activation Density: 0.012%

    No Known Activations