INDEX
    Explanations

    mathematical expressions or symbols related to equations

    Negative Logits
     itſelf       -1.02
     myſelf       -1.00
     Anſ          -0.96
     ſeveral      -0.93
     iſt          -0.91
     auffi        -0.88
     Majefty      -0.88
     Reſ          -0.87
     themſelves   -0.85
     himſelf      -0.85
    Positive Logits
    Enllaces      0.58
     Ver          0.55
    <eos>         0.54
    WriteLiteral  0.53
     &            0.51
     Sha          0.50
    },[])         0.50
    endforeach    0.47
    liyor         0.47
     "            0.45
    Act Density 0.087%

    No Known Activations