INDEX
    Explanations

    Terms related to "blow" or similar actions (such as "pump" and "dump"), as well as references to prompts and instructions.

    Negative Logits
    Token             Logit
    " Masson"         -0.45
    " Perché"         -0.44
    "VersionUID"      -0.43
    "丁目"            -0.41
    " saudara"        -0.41
    " pregă"          -0.40
    "Coordenadas"     -0.40
    "beelden"         -0.40
    " 때문"           -0.40
    "redients"        -0.38

    Positive Logits
    Token             Logit
    " blow"           0.88
    " pump"           0.87
    "pump"            0.81
    " dump"           0.80
    "Pump"            0.80
    " Pump"           0.79
    " prompt"         0.77
    "dump"            0.77
    " Blow"           0.76
    "Blow"            0.73

    Activation Density: 0.098%

    No Known Activations