INDEX
    Explanations

    numerical values and percentages mentioned as a figure or concept in the text

    Negative Logits
     tranquil   -0.63
    avorite     -0.63
     sclerosis  -0.62
     repay      -0.62
     haun       -0.60
     merry      -0.60
     regist     -0.59
    Cause       -0.58
    DEBUG       -0.58
     discour    -0.58
    Positive Logits
    .,          1.26
    .:          0.89
    ross        0.82
    .).         0.79
    eter        0.79
    aminer      0.77
    hey         0.77
    emonic      0.77
    raphics     0.76
    .,"         0.75
    Act Density 0.010%

    No Known Activations