INDEX
    Explanations

    questions or phrases that inquire about types or categories of things

    Negative Logits
    "ATIONS"      -0.69
    "uble"        -0.67
    "EF"          -0.65
    "ELL"         -0.65
    " Ess"        -0.61
    " Drift"      -0.59
    " Et"         -0.57
    "itations"    -0.57
    " THREE"      -0.57
    " Prelude"    -0.55
    Positive Logits
    "of"          0.98
    " of"         0.93
    "luster"      0.80
    "nesses"      0.72
    "achu"        0.69
    "icles"       0.68
    " thereof"    0.67
    " OF"         0.64
    " Of"         0.64
    "oft"         0.64
    Activation Density: 0.032%

    No Known Activations