INDEX
    Explanations

    statements related to personal reflection and self-awareness

    Negative Logits
    " brid"          -0.83
    " masked"        -0.80
    " pressing"      -0.79
    " continuous"    -0.79
    " carriage"      -0.78
    " raft"          -0.78
    " applicable"    -0.77
    " camp"          -0.76
    " revers"        -0.76
    " previously"    -0.76
    Positive Logits
    "And"            1.68
    "Because"        1.54
    "That"           1.54
    "It"             1.52
    "Then"           1.50
    "But"            1.50
    "If"             1.49
    "Maybe"          1.49
    "Advertisements" 1.49
    "They"           1.48
    Activation Density: 0.409%

    No Known Activations