INDEX
    Explanations

    references to figures or images in a technical context

    references to figures in the document

    Negative Logits
    Token             Logit
    " administ"       -0.72
    " convict"        -0.63
    "RAW"             -0.61
    " precious"       -0.61
    " captcha"        -0.61
    " offense"        -0.60
    "×IJ"             -0.59
    " conditioning"   -0.58
    "ngth"            -0.56
    "administ"        -0.56
    Positive Logits
    Token             Logit
    "ures"            1.23
    "uration"         1.21
    "aro"             1.15
    "ured"            1.09
    "urations"        1.07
    "uring"           1.06
    "URE"             1.04
    "wheel"           0.98
    "URES"            0.97
    "lio"             0.96
    Activation Density: 0.032%

    No Known Activations