INDEX
    Explanations

    mathematical proofs

    Negative Logits
    " skat"            -0.09
    " Azerbaijan"      -0.08
    " nob"             -0.08
    " enigmatic"       -0.08
    " mundane"         -0.08
    " somewhere"       -0.08
    " ske"             -0.08
    " Tribe"           -0.08
    " reunion"         -0.07
    " Airbus"          -0.07
    Positive Logits
    " sufficiently"    0.09
    " δ"               0.08
    "ittens"           0.08
    ".radius"          0.08
    ".small"           0.08
    "Ɛ"                0.08
    "(||"              0.08
    "_delta"           0.08
    " shrinking"       0.08
    "_radius"          0.08
    Activation Density: 0.009%

    No Known Activations