INDEX
    Explanations

    phrases discussing moral and ethical standards

    after "the" or "and"

    Negative Logits
     String           -0.62
     civilian         -0.54
    String            -0.53
     inflater         -0.50
    字符串            -0.50
     STRING           -0.48
    STRING            -0.48
     rato             -0.46
    fieldLabel        -0.45
    centralwidget     -0.45
    Positive Logits
     pictures         0.74
     myſelf           0.72
     itſelf           0.70
     themſelves       0.70
     Pictures         0.69
     purpoſe          0.68
     images           0.66
     picture          0.66
     visuals          0.65
     himſelf          0.65
    Act Density 0.335%

    No Known Activations