INDEX
    Explanations

    references to potential choking hazards

    Negative Logits
    istrovství
    -0.17
    ril
    -0.17
    tl
    -0.14
    leurs
    -0.14
    rij
    -0.14
    زان
    -0.14
     sadd
    -0.14
    汗
    -0.14
    μά
    -0.13
     dein
    -0.13
    Positive Logits
     swallowing
    0.40
     gag
    0.39
     choking
    0.38
     swallowed
    0.35
     throat
    0.35
     swallow
    0.34
     choked
    0.31
     choke
    0.31
    throat
    0.30
    å
    0.29
    Act Density 0.074%

    No Known Activations