INDEX
    Explanations

    numerical values and their representations

    Negative Logits
    ']],              -0.70
     autorytatywna    -0.69
    iddhar            -0.64
     Reif             -0.63
    eſt               -0.61
    czaj              -0.61
    mantel            -0.61
     refor            -0.61
    '],\n             -0.61
    égal              -0.60
    Positive Logits
    0                 1.70
                      0.92
                      0.89
    ۰                 0.88
    pellier           0.84
                      0.80
    ۰۰                0.71
    Literals          0.69
    𝟎                 0.69
                      0.68
    Act Density 0.438%

    No Known Activations