INDEX
    Explanations

    numerical values denoting certainty or probability, particularly when expressed as percentages

    phrases indicating certainty or quantifiable metrics, especially those expressing percentages

    Negative Logits
    Token      Logit
    rift       -0.78
    vati       -0.73
    rang       -0.72
    colm       -0.71
    olas       -0.70
    isen       -0.70
    netflix    -0.69
    achi       -0.69
    akings     -0.68
    attled     -0.67
    Positive Logits
    Token      Logit
    00000      1.21
    %"         1.03
    %]         0.96
    0000       0.95
    %;         0.90
    =#         0.89
    0000000    0.89
    Percent    0.86
    000000     0.85
    000        0.85
    Activation Density: 0.027%

    No Known Activations