INDEX
    Explanations

    words related to experimenting, tinkering, and tampering

    terms related to experimentation and manipulation

    Negative Logits
    Token          Logit
    " Citation"    -0.78
    " [+"          -0.68
    "ħ"            -0.68
    "åī"           -0.67
    "ilipp"        -0.67
    "ENS"          -0.67
    "article"      -0.67
    "vation"       -0.67
    "RIP"          -0.65
    "auri"         -0.65
    Positive Logits
    Token              Logit
    " tink"            1.10
    " tweaks"          0.90
    " tweaking"        0.88
    " tweak"           0.85
    " experimenting"   0.83
    " havoc"           0.79
    " nomine"          0.79
    " experimented"    0.76
    "ishly"            0.74
    "olicy"            0.73
    Activation Density: 0.070%

    No Known Activations