INDEX
    Explanations

    references to food and dishes, especially related to apples and desserts

    Negative Logits
    "itiveness"   -0.86
    "rous"        -0.70
    "Rub"         -0.69
    "raq"         -0.67
    "ivas"        -0.66
    "jee"         -0.65
    "roads"       -0.63
    "Greek"       -0.62
    "agos"        -0.62
    "oshop"       -0.62

    Positive Logits
    " opposed"     1.32
    " well"        1.08
    "ynchron"      1.06
    " soon"        1.02
    "part"         0.97
    " part"        0.94
    "pired"        0.92
    "well"         0.88
    " shown"       0.86
    "semble"       0.82

    Act Density 0.165%

    No Known Activations