    Negative Logits
    Token         Logit
    " bits"       -0.07
    " Gilbert"    -0.07
    "-links"      -0.07
    " vidé"       -0.07
    " depth"      -0.07
    " between"    -0.07
    "-city"       -0.07
    " poetic"     -0.06
    " Tibet"      -0.06
    " colour"     -0.06
    Positive Logits
    Token             Logit
    " assume"         0.15
    " assumed"        0.13
    " assuming"       0.12
    " assumption"     0.11
    " Assume"         0.11
    " assumptions"    0.11
    "assume"          0.10
    " assum"          0.10
    " assumes"        0.10
    " Assuming"       0.10
    Activation Density: 0.023%

    No Known Activations