INDEX
    Explanations

    warnings or alerts

    Negative Logits
    "hedral"     -0.82
    "morph"      -0.78
    "animate"    -0.77
    "growth"     -0.70
    "ablished"   -0.68
    "ophon"      -0.67
    "rencies"    -0.66
    "rafted"     -0.66
    "anova"      -0.65
    "artney"     -0.65
    Positive Logits
    "warning"        1.00
    " warning"       0.96
    " warnings"      0.90
    " Warn"          0.89
    " Warning"       0.87
    " disclaimer"    0.85
    " Signs"         0.81
    " warns"         0.80
    "warn"           0.79
    " warn"          0.76
    Act Density 0.030%

    No Known Activations