INDEX
    Explanations

    terms related to observation and monitoring

    Negative Logits
    Token     Logit
    " A"      -0.70
    " P"      -0.68
    "P"       -0.65
    " The"    -0.64
              -0.62
    " All"    -0.61
    "<eos>"   -0.60
    " K"      -0.60
    " C"      -0.58
    " It"     -0.58
    Positive Logits
    Token             Logit
    "observations"    1.53
    " Observations"   1.53
    " observations"   1.51
    " Observation"    1.49
    " OBSERV"         1.47
    " obser"          1.46
    " observes"       1.45
    " Observ"         1.43
    " observation"    1.41
    "OBSERV"          1.39
    Act Density 0.079%

    No Known Activations