    Explanations

    computational algorithms and circuits

    text that describes or requests bypassing, compromising, or hijacking systems (jailbreaking/hacking-style instructions).

    Negative Logits
     throwIfNotFound   0.46
    claims             0.46
    animals            0.44
    edition            0.43
     महंगाई            0.43
    ingredients        0.42
     drug              0.42
    rope               0.42
     animali           0.42
     smelly            0.41
    Positive Logits
     algorithms        1.06
     algorit           0.99
     subroutine        0.98
     алгорит           0.97
     algorith          0.95
     computational     0.91
     algorithm         0.89
     Algorithms        0.89
     algoritmo         0.88
     circuits          0.88
    Act Density 0.223%

    No Known Activations