INDEX
    Explanations

    references to candy, particularly candy corn and its manufacturers

    Negative Logits
    " nahilalakip"     -0.65
    "enderror"         -0.62
    " ujednoznacz"     -0.61
    "bootstrapcdn"     -0.61
    "ſicht"            -0.60
    " deſſen"          -0.59
    " erſt"            -0.59
    "<pad>"            -0.59
    "[@BOS@]"          -0.59
    "<unused3>"        -0.59
    Positive Logits
    " candy"           0.92
    " candies"         0.78
    "candy"            0.72
    "🍬"               0.69
    "Candy"            0.68
    " Candy"           0.67
    "🍭"               0.61
    " sweets"          0.57
    " confection"      0.56
    " Süß"             0.52
    Activation Density: 0.012%

    No Known Activations