    Explanations

    instances of effort, research, and planning in various contexts

    Negative Logits
    ollider       -0.16
    geç           -0.15
    taÅŁ          -0.13
    ichert        -0.13
    enant         -0.13
    èĥ½åĬĽ        -0.13
    .Binding      -0.12
    andler        -0.12
    ensburg       -0.12
     Ending       -0.12
    Positive Logits
     research          0.33
     investigation     0.28
     deliber           0.27
     contempl          0.27
     strateg           0.27
     thought           0.26
     analysis          0.26
     thinking          0.26
     detective         0.25
     experimentation   0.25
    Activation Density: 0.342%

    No Known Activations