INDEX
    Explanations

    mentions of food-related topics and consumption patterns

    "food" followed by descriptive words

    Negative Logits
    Token          Logit
    ]";            -1.01
    )"),           -0.99
    )");           -0.95
    (blank)        -0.92
    ."]            -0.92
    BibitemShut    -0.92
    \"");          -0.91
    ']],           -0.91
    ."),           -0.90
    ])]            -0.90
    Positive Logits
    Token          Logit
    -              0.66
    y              0.63
    z              0.63
    w              0.63
    H              0.62
    f              0.62
    b              0.61
    P              0.60
    man            0.59
    k              0.59
    Act Density 0.577%

    No Known Activations