INDEX
    Explanations

    references to humanitarian aid and cultural identity

    Negative Logits
    haps
    -0.16
    ocs
    -0.15
    ouncil
    -0.14
     другого
    -0.14
    的一个
    -0.14
    uhl
    -0.14
    งหมด
    -0.13
    isas
    -0.13
    urope
    -0.13
    chw
    -0.13
    Positive Logits
     both
    0.56
    both
    0.53
     BOTH
    0.46
     two
    0.45
    Both
    0.43
     Both
    0.42
    两个
    0.41
     respectively
    0.40
    _both
    0.40
     beiden
    0.40
    Act Density 0.501%

    No Known Activations