INDEX
    Explanations

    discussions surrounding expectations, opinions, and realizations in various contexts

    Negative Logits
    '},\n     -1.25
    ".\n      -1.23
    "):\n     -1.23
    ")){\n    -1.21
    "},\n     -1.20
    "){\n     -1.17
    '),\n     -1.15
    )"),      -1.14
    "),\n     -1.14
    ſelves    -1.12
    Positive Logits
    .         1.30
    ,         1.22
    !         1.15
    ;         1.14
    ?         0.99
    :         0.78
    !!        0.77
     (        0.71
     in       0.69
    !!!       0.68
    Act Density 0.957%

    No Known Activations