Explanations
commands and expressions of obedience
Negative Logits
endiri           -0.44
InjectAttribute  -0.42
enderror         -0.40
springfox        -0.40
GeneratedCode    -0.37
dafx             -0.35
BorderSide       -0.35
grenze           -0.35
bouteilles       -0.35
avancée          -0.35
Positive Logits
obedience  0.93
Obedience  0.88
obey       0.83
obeyed     0.83
obedience  0.83
obé        0.79
obey       0.75
obeys      0.75
obed       0.74
obeying    0.73
Activations Density 0.340%