INDEX
    Explanations

    references to cakes and baking processes

    Negative Logits
    " Fireplace"  -0.16
    " Chili"      -0.15
    "_bin"        -0.14
    "Soup"        -0.14
    " chili"      -0.14
    "idi"         -0.14
    "Criterion"   -0.14
    "ñana"        -0.14
    "алов"        -0.14
    " Sauce"      -0.13
    Positive Logits
    " cake"      0.45
    " cakes"     0.39
    " Cake"      0.37
    "cake"       0.37
    "Cake"       0.32
    "cakes"      0.31
    " birthday"  0.27
    "ecake"      0.26
    " tiers"     0.25
    "cak"        0.25
    Activation Density: 0.031%

    No Known Activations