    Negative Logits
     sliced     -0.08
     mellitus   -0.08
     amplifier  -0.07
     healed     -0.07
    umnos       -0.07
     solved     -0.07
    agrama      -0.07
     meld       -0.07
    łość        -0.07
     trape      -0.07
    Positive Logits
     Crawl      0.10
     crawling   0.10
                0.10
     Craw       0.10
     crawl      0.10
    .walk       0.09
     craw       0.09
     crawler    0.09
    Crawler     0.09
                0.09
    Activation Density: 0.002%

    No Known Activations