INDEX
    Explanations

    patterns involving specific numerical relationships and inequalities in structured data

    Negative Logits
     itſelf
    -1.76
     myſelf
    -1.69
     Efq
    -1.67
     ―――――
    -1.56
     ſind
    -1.54
     resourceCulture
    -1.53
     auffi
    -1.53
    )");
    
    -1.53
     Theſe
    -1.53
    )";
    
    -1.52
    Positive Logits
    ↵↵
    1.01
    -
    1.01
    <eos>
    0.98
    0.94
    _
    0.85
    h
    0.82
    1
    0.81
    J
    0.80
    L
    0.79
    G
    0.79
    Act Density 0.190%

    No Known Activations