INDEX
    Explanations

    questions that are posed in a rhetorical or philosophical context

    Negative Logits
    .       -0.54
            -0.52
     In     -0.50
     L      -0.47
     D      -0.47
     The    -0.46
     K      -0.45
    </i>    -0.44
     I      -0.44
    DED     -0.44
    Positive Logits
    ?       1.95
    %?      1.75
    ?—      1.69
    ?}      1.69
    ?       1.62
    ?’      1.61
    ?&      1.60
    ?<      1.59
    ?”      1.59
    ?"      1.58
    Activation Density: 0.147%

    No Known Activations