INDEX
    Explanations

    variations of the word "operate" across different contexts

    Negative Logits
    Token         Logit
    "led"         -0.18
    "olic"        -0.18
    " Fathers"    -0.17
    "boxed"       -0.16
    "e"           -0.16
    "ey"          -0.15
    "ing"         -0.15
    "eper"        -0.15
    "itary"       -0.15
    "eken"        -0.14
    Positive Logits
    Token         Logit
    "ational"     0.24
    "атив"        0.22
    "etta"        0.22
    "atings"      0.20
    "ativ"        0.19
    "ative"       0.18
    "ATIONAL"     0.18
    "ATORS"       0.18
    "atives"      0.17
    "-oper"       0.17
    Activation Density 0.006%

    No Known Activations