INDEX
    Explanations
    Negative Logits
    енной          -0.08
     пах           -0.08
    enzyme         -0.08
    .en            -0.08
     formatted     -0.08
     Moist         -0.08
     Puzzle        -0.07
    _entropy       -0.07
    .Sequential    -0.07
    енную          -0.07
    Positive Logits
     casualties      0.15
     safety          0.15
    安全              0.14
     безопасность    0.14
     안전             0.13
     veiligheid      0.13
    Safety           0.13
     fatalities      0.12
     സുരക്ഷ          0.12
     bezpieczeń      0.12
    Activation Density: 0.079%

    No Known Activations