INDEX
    Explanations

    bracketed references

    Negative Logits
    -tax      -0.08
    Life      -0.07
    -done     -0.07
    zug       -0.07
     ",↵      -0.07
    rounded   -0.06
    dx        -0.06
    _fixed    -0.06
    ,opt      -0.06
    Answer    -0.06
    Positive Logits
     BaseEntity      0.08
     contradictory   0.06
     Kentucky        0.06
    UGC              0.06
     decidedly       0.06
    ocrats           0.06
     M               0.06
     Phú             0.06
     CONSTANT        0.06
     singular        0.06
    Activation Density: 0.032%

    No Known Activations