    Explanations

    information relating to quantities or durations

    phrases indicating approximate quantities or durations

    Negative Logits
    Token      Logit
    "ode"      -0.68
    "idia"     -0.66
    "rog"      -0.64
    "NEY"      -0.64
    "owler"    -0.62
    "achus"    -0.62
    "ESE"      -0.62
    "agate"    -0.60
    "ells"     -0.60
    "etts"     -0.59
    Positive Logits
    Token         Logit
    "200"         0.84
    "300"         0.82
    "600"         0.80
    "700"         0.76
    "500"         0.74
    "capacity"    0.74
    "900"         0.73
    "800"         0.73
    " 200"        0.72
    "80"          0.71
    Activation Density: 0.098%

    No Known Activations