INDEX
    Explanations

    phrases describing imaginative scenarios

    hypothetical scenarios and thought experiments

    NEGATIVE LOGITS
    Nonetheless        -0.77
    Nevertheless       -0.75
    Exit               -0.73
     Nevertheless      -0.68
    "},                -0.68
    particularly       -0.67
     Nonetheless       -0.66
    ason               -0.65
     unfocusedRange    -0.65
    so                 -0.63
    POSITIVE LOGITS
    Imagine             1.10
     Imagine            1.09
     scenario           1.07
     hypot              0.98
     imagine            0.98
     hypothetical       0.90
     scenarios          0.89
     Suppose            0.88
     dystopian          0.81
     suddenly           0.77
    Activation Density: 0.396%

    No Known Activations