INDEX
    Explanations

    unwanted sexual thoughts or urges

    Negative Logits
    unting  0.45
    ances  0.45
    ong  0.44
    ،  0.43
    rake  0.41
    ación  0.41
    tan  0.40
    rant  0.40
    ells  0.40
    ían  0.39
    Positive Logits
    на  0.59
     pensando  0.54
    ;  0.53
    ר  0.52
    ла  0.52
    n  0.50
     gode  0.50
    LE  0.49
    ;}  0.49
    р  0.49
    Activation Density: 0.255%

    No Known Activations