INDEX
    Explanations

    words connected to legality and moral implications

    Negative Logits
    ugeot          -0.68
     continúas     -0.65
    rews           -0.65
     képes         -0.64
    agascar        -0.63
    ignty          -0.62
                   -0.60
    thasone        -0.59
    erapeutic      -0.58
    uests          -0.58
    Positive Logits
    MessageTagHelper    0.70
    ...");\n            0.68
    PreferredItem       0.68
    oa̍t                 0.68
    ...\n               0.67
    AndEndTag           0.66
    ArgsConstructor     0.65
    !\n                 0.65
    ization             0.64
    ?—                  0.64
    Act Density 1.982%

    No Known Activations