INDEX
    Explanations

    questions posed in the text

    Negative Logits
    Token            Logit
    "hya"            -0.70
    "bably"          -0.67
    "OUND"           -0.67
    "paio"           -0.64
    "Statistics"     -0.61
    "iful"           -0.59
    "hematically"    -0.59
    "undai"          -0.59
    " effects"       -0.58
    " outcomes"      -0.58
    Positive Logits
    Token            Logit
    "asks"           0.78
    " asks"          0.65
    " asked"         0.65
    " inquired"      0.62
    "asking"         0.62
    " pond"          0.61
    " quer"          0.59
    " Bought"        0.58
    " apologise"     0.58
    "Quest"          0.57
    Activation Density: 0.018%

    No Known Activations