INDEX
    Explanations

    references to toxicity and poisonous substances

    Negative Logits
    Token             Logit
    "BagLayout"       -0.39
    "WriteLiteral"    -0.33
    " Robe"           -0.33
    " shawl"          -0.32
    " Burt"           -0.32
    " befe"           -0.32
    " paille"         -0.32
    " Torn"           -0.31
    "Burt"            -0.31
    " tackles"        -0.30
    Positive Logits
    Token             Logit
    " poison"         1.52
    "poison"          1.43
    "Poison"          1.38
    " poisoning"      1.34
    " Poison"         1.34
    " poisonous"      1.33
    " toxic"          1.31
    " toxicity"       1.30
    " Toxic"          1.27
    "toxic"           1.24
    Activation Density: 0.570%

    No Known Activations