    Explanations

    mentions of different types of candies

    Negative Logits
    "yon"            -0.69
    "lihood"         -0.68
    "inen"           -0.66
    "productive"     -0.65
    "Published"      -0.64
    "chron"          -0.62
    "heed"           -0.61
    "iets"           -0.61
    " transcript"    -0.61
    " Hutchinson"    -0.61
    Positive Logits
    " cane"      1.12
    " candy"     0.94
    "strip"      0.93
    "mallow"     0.92
    "flake"      0.88
    " wra"       0.85
    "bucks"      0.84
    "bar"        0.83
    " sweets"    0.81
    "weet"       0.81
    Activation Density: 0.032%

    No Known Activations