INDEX
Explanations
phrases related to bypassing security measures
terms related to evading or bypassing rules and regulations
Negative Logits
Sad        -0.67
uesday     -0.67
killer     -0.66
soon       -0.66
mad        -0.64
iaries     -0.64
breaths    -0.64
oom        -0.63
asp        -0.62
aeus       -0.62
Positive Logits
detection       1.20
prohibitions    0.85
FOIA            0.83
restrictions    0.81
regulations     0.81
censorship      0.81
bounds          0.79
Regulation      0.78
pesky           0.78
Detection       0.77
Activations Density: 0.126%