    Explanations

    phrases related to toxicity or harmful substances

    references to toxic substances and their effects

    Negative Logits
    FORE      -0.72
    hung      -0.71
    quart     -0.70
    Untitled  -0.69
    zzi       -0.68
    wright    -0.68
    ploma     -0.68
    BO        -0.66
    bler      -0.65
    telling   -0.65
    Positive Logits
     masculinity  1.06
     poisoning    0.96
     algae        0.92
     waste        0.92
    ological      0.90
     fumes        0.90
    oxic          0.89
     substances   0.89
    ologist       0.87
    ology         0.87
    Activation Density: 0.017%

    No Known Activations