INDEX
    Explanations

    commands and expressions of obedience

    Negative Logits
    "endiri"           -0.44
    "InjectAttribute"  -0.42
    "enderror"         -0.40
    " springfox"       -0.40
    "GeneratedCode"    -0.37
    "dafx"             -0.35
    " BorderSide"      -0.35
    "grenze"           -0.35
    " bouteilles"      -0.35
    " avancée"         -0.35
    Positive Logits
    " obedience"       0.93
    " Obedience"       0.88
    " obey"            0.83
    " obeyed"          0.83
    "obedience"        0.83
    " obé"             0.79
    "obey"             0.75
    " obeys"           0.75
    " obed"            0.74
    " obeying"         0.73
    Activation Density: 0.340%

    No Known Activations