INDEX
    Explanations

    words related to trying out new methods or ideas

    instances of the word "experiment" and its variations, indicating a focus on experimentation across various contexts

    Negative Logits
    "othy"          -0.68
    "translation"   -0.67
    "cut"           -0.66
    "si"            -0.66
    "sama"          -0.64
    "HCR"           -0.64
    "say"           -0.64
    "byn"           -0.63
    "vation"        -0.62
    "trans"         -0.62
    Positive Logits
    " experimenting"      1.11
    " experimented"       0.98
    " withd"              0.94
    " experimentation"    0.89
    "iments"              0.85
    " tink"               0.84
    "imental"             0.82
    "ienced"              0.78
    "redients"            0.77
    "isher"               0.74
    Act Density 0.015%
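
    As a rough illustration of how a density figure like the one above might arise, the sketch below assumes "Act Density" means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# active tokens) / (# tokens sampled).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 20,000 sampled tokens.
sample = [0.0] * 19_997 + [0.4, 1.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under this assumed definition, a density of 0.015% corresponds to roughly 15 active tokens per 100,000 sampled.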

    No Known Activations