INDEX
    Explanations

    instances of critical errors, consequences, and health-related issues

    Negative Logits
    "uros"            -0.15
    "actors"          -0.15
    "uffers"          -0.14
    "QRST"            -0.14
    " aggrav"         -0.14
    " Durch"          -0.14
    "rawn"            -0.14
    "734"             -0.14
    " intimidating"   -0.13
    "enty"            -0.13
    Positive Logits
    " cost"           0.40
    "cost"            0.39
    "Cost"            0.34
    " costs"          0.34
    " Cost"           0.33
    " COST"           0.33
    "-cost"           0.32
    "_cost"           0.29
    ".cost"           0.29
    " Costs"          0.29
    Act Density 0.265%

    No Known Activations