INDEX
    Explanations

    mentions of "linear" and associated mathematical concepts

    Negative Logits
    uki       -0.16
    alian     -0.16
    vů        -0.14
    DES       -0.14
    arta      -0.14
    aire      -0.14
    /views    -0.14
    uffy      -0.13
    DED       -0.13
    [Unit     -0.13
    Positive Logits
    ÑĢд       0.16
    erin      0.15
    ford      0.14
    ajs       0.14
    erosis    0.14
    رد        0.14
    ampton    0.14
     stamps   0.14
    _batches  0.14
    xfff      0.13
    Activation Density: 0.009%

    No Known Activations