    NEGATIVE LOGITS
    Token             Logit
    " stylesheet"     -0.08
    " expects"        -0.08
    " subsidy"        -0.08
    " conventions"    -0.08
    "Naming"          -0.08
    " lubricant"      -0.08
    "לים"             -0.08
    " stylist"        -0.08
    " leef"           -0.07
    " styling"        -0.07
    POSITIVE LOGITS
    Token             Logit
    " grilled"        0.11
    " tacos"          0.10
    " pancakes"       0.09
    " burgers"        0.09
    " Sushi"          0.09
    " salads"         0.09
    " gourmet"        0.08
    " billet"         0.08
    " sushi"          0.08
    " appetizers"     0.08
    Activation Density: 0.014%

    No Known Activations