    Explanations

    numeric values or mathematical expressions

    Negative Logits
        Token               Logit
        <bos>               -1.21
        ]='\                -0.96
         autorytatywna      -0.93
        ']))                -0.92
        دانشنامهٔ            -0.91
        ,:);                -0.90
         Bonneville         -0.90
        ']")                -0.89
         }}"></             -0.89
         للمعارف            -0.89
    Positive Logits
        Token               Logit
        2                   2.05
        3                   1.42
        4                   1.34
        5                   1.19
        6                   1.17
        1                   1.17
        0                   1.08
        7                   1.08
        8                   1.06
        two                 0.88
    Activation Density: 1.068%

    No Known Activations