INDEX
    Explanations

    questions about the truth or nature of statements made

    questions that challenge assumptions or beliefs

    Negative Logits
    bies          -0.90
    dit           -0.79
    usters        -0.77
    tions         -0.73
    papers        -0.71
    Topics        -0.71
    umbn          -0.69
    ixels         -0.69
    ventures      -0.68
    former        -0.67

    Positive Logits
     conceivable  1.02
     really       0.95
     worth        0.91
     possible     0.89
     Possible     0.88
     ever         0.87
     Really       0.85
     worthwhile   0.83
     feasible     0.83
     REALLY       0.83
    Activation Density: 0.057%

    No Known Activations