INDEX
    Explanations

    terms and phrases related to health and safety warnings

    Negative Logits
    "ãģ¾ãģļ"    -0.14
    "uggy"      -0.14
    "ÑĪов"      -0.14
    "ãĥĩãĥ«"    -0.14
    "ìļ´ëį°"    -0.13
    "imedia"    -0.13
    "оваÑĢи"    -0.13
    "ÌĨ"        -0.13
    "Except"    -0.13
    "anging"    -0.13
    Positive Logits
    " similarly"    0.55
    " Similarly"    0.52
    "Similarly"     0.50
    " Likewise"     0.45
    " another"      0.43
    " likewise"     0.43
    " Dit"          0.38
    " Another"      0.36
    " same"         0.35
    "Lik"           0.35
    Activation Density: 0.309%

    No Known Activations