INDEX
    Explanations

    phrases indicating progress or development

    Negative Logits
     ongoing    -0.17
    avig        -0.16
    isters      -0.15
    roj         -0.15
    uous        -0.14
    edis        -0.14
     Trap       -0.14
    reetings    -0.14
    idis        -0.14
    enas        -0.14
    Positive Logits
     lengths    0.21
     extremes   0.19
     tangent    0.19
     motions    0.19
     fishing    0.18
     tilt       0.17
     routes     0.17
     route      0.17
     broke      0.17
     crazy      0.17
    Act Density 0.107%

    No Known Activations