    Explanations

    words related to experimentation and trying out new things

    instances of experimentation and related concepts

    Negative Logits
    Token                Logit
    " Cheong"            -0.72
    "cube"               -0.69
    " Heb"               -0.67
    "translation"        -0.67
    "mary"               -0.65
    " Supporting"        -0.65
    "lat"                -0.64
    "die"                -0.64
    "ens"                -0.63
    "trans"              -0.63
    Positive Logits
    Token                Logit
    " experimenting"     1.10
    " experimented"      0.98
    " tink"              0.96
    " withd"             0.96
    " experimentation"   0.95
    "odox"               0.90
    "redients"           0.89
    "quished"            0.87
    "GGGGGGGG"           0.85
    "iences"             0.74
    Activation Density: 0.014%

    No Known Activations