INDEX
    Explanations

    numbers in textual formats

    occurrences of various numerical references and related terms

    Negative Logits
    Token         Logit
    "hips"        -0.89
    "loo"         -0.83
    "WARD"        -0.70
    "wards"       -0.69
    "ioned"       -0.68
    " Denis"      -0.67
    "fully"       -0.66
    " Hilton"     -0.63
    "lain"        -0.63
    " Leopard"    -0.61
    Positive Logits
    Token         Logit
    "eral"        1.05
    "ero"         1.04
    "posium"      1.03
    "pty"         1.03
    "mus"         0.95
    "phony"       0.92
    "bs"          0.92
    "pt"          0.87
    "asm"         0.86
    "urg"         0.86
    Activation Density: 0.042%

    No Known Activations