INDEX
    Explanations

    phrases indicating a range or approximation

    phrases that indicate a range of conditions or variations

    Negative Logits
    Token           Logit
    " corridors"    -0.56
    "Monitor"       -0.54
    "CHR"           -0.53
    " challeng"     -0.53
    " Puzz"         -0.52
    "afety"         -0.52
    " horizont"     -0.51
    " Vector"       -0.51
    "verages"       -0.51
    " Traps"        -0.50
    Positive Logits
    Token           Logit
    "nery"          1.10
    " less"         0.94
    "leans"         0.93
    "nam"           0.86
    "gin"           0.86
    "phans"         0.84
    "acular"        0.84
    "chid"          0.82
    "acle"          0.80
    " fewer"        0.77
    Activation Density: 0.018%

    No Known Activations