    Explanations

    mentions of different types of fruits

    mentions of the word "fruit."

    Negative Logits
    Token         Logit
    " Standing"   -0.73
    "agonists"    -0.68
    "nee"         -0.65
    " Century"    -0.62
    "citizens"    -0.62
    " Sioux"      -0.61
    "Rated"       -0.60
    "uled"        -0.60
    " silenced"   -0.60
    "rupulous"    -0.59

    Positive Logits
    Token         Logit
    "fruit"       1.40
    " fruit"      1.39
    " fruits"     1.25
    " juice"      1.03
    " Fruit"      1.01
    "ruit"        1.01
    "ruits"       0.98
    "cake"        0.95
    "cakes"       0.93
    " mango"      0.82

    Activation Density: 0.017%

    No Known Activations