INDEX
    Negative Logits
    Token          Logit
    $")            -1.21
    )");↵          -1.13
     whoſe         -1.13
    BibitemShut    -1.09
    —              -1.07
     pleaſure      -1.07
    "):↵           -1.06
    numerusform    -1.05
    ")));↵         -1.05
     >=",          -1.05
    Positive Logits
    Token          Logit
    (              0.83
                   0.75
    '              0.70
    I              0.69
    L              0.68
    .              0.65
    D              0.65
    es             0.65
    ,              0.63
    S              0.62
    Activation Density: 1.392%

    No Known Activations