    Explanations

    child sexual abuse and exploitation

    Negative Logits
    " Deck"      0.49
    " Dock"      0.47
    " deck"      0.44
    "家伙"       0.41
    " docked"    0.40
    "्‍यादा"      0.40
    " Ju"        0.39
    " dock"      0.39
    "Deck"       0.39
    "cij"        0.39
    Positive Logits
    "hood"       0.65
    "🧒"         0.65
    " welfare"   0.64
    " endanger"  0.59
    "swear"      0.58
    " rearing"   0.56
    "bearing"    0.52
    " prodig"    0.52
    " Welfare"   0.51
    "Hood"       0.50
    Activation Density: 0.025%

    No Known Activations