INDEX
    Explanations

    quantitative references or estimates related to research findings

    Followed by nouns indicating quantity

    many, several, some, certain

    Negative Logits
    both        -0.60
    X           -0.52
    J           -0.52
    E           -0.51
    /           -0.51
    H           -0.50
    the         -0.50
    Most        -0.49
     thingy     -0.49
    Z           -0.49
    Positive Logits
    '],\n       0.89
    "],\n       0.87
    NUMX        0.87
    >--}}       0.87
     notable    0.86
    ]();        0.86
    .}(         0.83
    "]);\n      0.83
    ]:\n        0.83
     kinds      0.82
    Activation Density: 0.780%

    No Known Activations