INDEX
    Explanations

    repeated mentions of the word "any"

    Negative Logits
        Token          Logit
        " elig"        -0.69
        "isable"       -0.69
        " dilig"       -0.66
        " fung"        -0.66
        "PDATE"        -0.60
        " unemploy"    -0.60
        "ITNESS"       -0.60
        "reditary"     -0.60
        " leptin"      -0.59
        " surv"        -0.58
    Positive Logits
        Token          Logit
        "where"        1.03
        "one"          0.92
        "ika"          0.92
        "body"         0.87
        "emi"          0.83
        "uan"          0.83
        "elle"         0.81
        "heter"        0.81
        "thood"        0.80
        "THING"        0.80
    Activation Density: 0.005%

    No Known Activations