INDEX
    Explanations

    descriptions emphasizing simplicity in various contexts

    Negative Logits
     <=",
    -0.89
    istoitu
    -0.82
    TintMode
    -0.81
     AssemblyCompany
    -0.81
     للمعارف
    -0.79
    WriteBarrier
    -0.78
    帖最后由
    -0.76
    potranspiration
    -0.76
    GEBURTS
    -0.75
    principalColumn
    -0.75
    Positive Logits
     simple
    1.79
    simple
    1.66
     Simple
    1.66
    Simple
    1.65
     SIMPLE
    1.55
     simplicity
    1.50
     simples
    1.43
    SIMPLE
    1.42
     simpler
    1.36
     simpl
    1.34
    Act Density 0.189%

    No Known Activations