INDEX
    Explanations

    instances of the words "hot" and "warmed" related to topics of health or cooking

    Negative Logits
     Downs
    -0.16
     Deutsch
    -0.15
    alth
    -0.15
     Poke
    -0.15
    olume
    -0.13
    uddy
    -0.13
    Adds
    -0.13
    oundary
    -0.13
     tul
    -0.13
    æ
    -0.13
    Positive Logits
    êµ´
    0.15
    bsd
    0.15
    VRTX
    0.15
    ë¡Ŀ
    0.15
    lique
    0.14
    RelativeTo
    0.14
    ì£
    0.14
     ÙĨس
    0.14
    trinsic
    0.14
    vÃŃc
    0.14
    Act Density 0.006%

    No Known Activations