    Negative Logits
     inherits       -0.08
                    -0.07
    Tuesday         -0.07
                    -0.07
     burns          -0.07
     crashes        -0.06
     poor           -0.06
     Level          -0.06
     razor          -0.06
     reluctance     -0.06
    Positive Logits
    (hero           0.07
                    0.07
    있는            0.06
    mino            0.06
    (boolean        0.06
     ero            0.06
    weed            0.06
     spiritually    0.06
    ':'             0.06
     gül            0.06
    Act Density 0.006%

    No Known Activations