INDEX
    Explanations

    mentions of food and dining-related terms

    references to food or dining options

    Negative Logits
    Token          Logit
    "ultan"        -0.85
    "icus"         -0.82
    "orial"        -0.78
    "acca"         -0.78
    "iversal"      -0.70
    "izations"     -0.70
    "ieth"         -0.70
    "ulating"      -0.69
    " interf"      -0.69
    "ulkan"        -0.69
    Positive Logits
    Token          Logit
    " fare"        1.08
    " fares"       1.01
    "ttes"         0.94
    "fare"         0.81
    "well"         0.78
    "ptin"         0.76
    "bill"         0.74
    "rer"          0.72
    "jit"          0.72
    "ways"         0.72
    Act Density 0.007%

    No Known Activations