INDEX
    Explanations

    terms related to risk reduction and safety mechanisms

    Negative Logits
     Ну
    -0.13
     повин
    -0.13
    444
    -0.13
    ansa
    -0.13
    çĸ
    -0.13
     guts
    -0.13
    agnostic
    -0.13
    vem
    -0.13
    linger
    -0.13
    agra
    -0.12
    Positive Logits
     drafts
    0.20
     premature
    0.20
     sag
    0.18
     reflections
    0.17
     foreign
    0.17
     преж
    0.17
    Build
    0.17
     ghost
    0.16
     undue
    0.16
     excessive
    0.16
    Activation Density 0.214%

    No Known Activations