    Explanations

    words related to mathematical variables and operations

    Negative Logits
    Token          Logit
    " previa"      -0.91
    " Ceramby"     -0.90
    " שוליים"      -0.79
    "#+#"          -0.77
    " Gente"       -0.75
    " Sila"        -0.74
    " Dooley"      -0.74
    " Lerner"      -0.73
    " Ferrell"     -0.72
    " Monkey"      -0.70
    Positive Logits
    Token          Logit
    " d"           1.41
    " D"           1.28
    "D"            1.20
    "d"            1.20
    "getD"         1.10
    (not shown)    0.91
    "д"            0.90
    "Dd"           0.86
    "PhysRevD"     0.82
    "د"            0.81
    Activation Density: 0.223%

    No Known Activations