    Explanations

    warnings and safety concerns regarding children and small objects

    Negative Logits
    "Guy"       -0.16
    "èĤ©"       -0.16
    "Leaks"     -0.15
    "vir"       -0.15
    " Shack"    -0.15
    "ilon"      -0.14
    "vasion"    -0.14
    "ñana"      -0.14
    "reon"      -0.14
    "emand"     -0.14
    Positive Logits
    " dangerous"    0.22
    " unsafe"       0.21
    "Unsafe"        0.21
    " danger"       0.20
    "unsafe"        0.19
    " children"     0.19
    " safety"       0.18
    " Dangerous"    0.18
    " safer"        0.17
    " dangers"      0.17
    Act Density 0.050%

    No Known Activations