    Explanations

    questions starting with "What would" or similar variations

    conditional phrases or hypothetical situations

    Negative Logits
    Token           Logit
    "haus"          -0.75
    "fort"          -0.68
    "fox"           -0.67
    "ibaba"         -0.67
    "anie"          -0.64
    "cule"          -0.64
    "ledge"         -0.64
    "belt"          -0.64
    "hill"          -0.63
    "skirts"        -0.63

    Positive Logits
    Token           Logit
    "?]"            0.78
    " happen"       0.74
    " millenn"      0.68
    " entail"       0.63
    " theolog"      0.61
    " deg"          0.60
    "ENTS"          0.59
    " reconc"       0.59
    " corrections"  0.59
    "?)"            0.59

    Activation Density: 0.076%

    No Known Activations