    Explanations

    instances of violence or injuries

    Negative Logits
    Token                 Logit
    " initComponents"     -0.47
    "IBinder"             -0.47
    " تضيفلها"            -0.44
    " erw"                -0.41
    "gheny"               -0.40
    " izin"               -0.39
    " Lorde"              -0.39
    "ushchev"             -0.38
    " Picchu"             -0.38
    "mäßigen"             -0.36

    Positive Logits
    Token                 Logit
    " ſte"                0.59
    " ſta"                0.56
    " faſt"               0.53
    "ſelves"              0.52
    " unicórnio"          0.51
    " juſ"                0.49
    "aarrggbb"            0.49
    "retudo"              0.49
    " ſont"               0.48
    "pushFollow"          0.47

    Activation Density: 0.119%

    No Known Activations