INDEX
    Explanations

    mentions of chocolate and dessert-related terms

    NEGATIVE LOGITS
    atives        -0.86
    WARD          -0.82
    umbnail       -0.73
    kus           -0.72
    ership        -0.71
    igate         -0.71
     Imran        -0.70
    ATIVE         -0.70
    inen          -0.68
     REPORT       -0.68
    POSITIVE LOGITS
     cake         0.94
     pudding      0.91
    anut          0.89
     chocolate    0.89
     coated       0.89
     cane         0.84
     chip         0.83
    ★★            0.82
     flavored     0.81
     butter       0.81
    Activation Density: 7.922%

    No Known Activations