INDEX
    Explanations

    numerical references and citations in the text

    Citations or footnotes by bracketed numbers

    Negative Logits
    Token        Logit
    " thing"     -1.45
    " THING"     -1.08
    " things"    -1.03
    "thing"      -1.02
    "things"     -0.98
    " Things"    -0.95
    " Thing"     -0.94
    "Things"     -0.91
    "Thing"      -0.88
    " THINGS"    -0.86
    Positive Logits
    Token          Logit
    "8"            0.65
    "7"            0.64
    "4"            0.63
    "9"            0.63
    "6"            0.63
    "5"            0.57
    "3"            0.56
    "CppMethod"    0.54
    " Wider"       0.50
    " octaves"     0.49
    Act Density: 1.265%

    No Known Activations