INDEX
    Explanations

    sentences related to counting or measuring

    Negative Logits
    Token             Logit
    "ieties"          -0.66
    "Oracle"          -0.65
    " conditioning"   -0.64
    " obser"          -0.63
    " Harbor"         -0.62
    " Bree"           -0.62
    " waivers"        -0.61
    "insula"          -0.61
    " Centauri"       -0.60
    " habit"          -0.59
    Positive Logits
    Token             Logit
    "enance"          1.78
    "downs"           0.89
    "esses"           0.87
    "rified"          0.86
    "ess"             0.82
    "ries"            0.77
    "icates"          0.77
    "ensen"           0.76
    "icated"          0.74
    "eenth"           0.74
    Activation Density: 0.629%

    No Known Activations