INDEX
    Explanations

    logarithmic equations

    Negative Logits
    Token             Logit
    (not shown)       -0.09
    " bustling"       -0.07
    "快速"            -0.07
    " rivalry"        -0.07
    " clashes"        -0.07
    "rit"             -0.07
    " contrasting"    -0.07
    " ам"             -0.07
    " schnell"        -0.07
    "Vs"              -0.07
    Positive Logits
    Token             Logit
    " Nested"         0.14
    "nested"          0.13
    "Nested"          0.13
    " nested"         0.13
    "_nested"         0.12
    (not shown)       0.12
    " المستوى"        0.11
    " nesting"        0.11
    "三级"            0.10
    " 一级"           0.10
    Activation Density: 0.031%

    No Known Activations