INDEX
    Explanations

    questions and statements that challenge the status quo or express skepticism

    Head Attr Weights
    0:0.04
    1:0.06
    2:0.02
    3:0.10
    4:0.07
    5:0.34
    6:0.06
    7:0.03
    8:0.07
    9:0.11
    10:0.03
    11:0.03
    Negative Logits
    " Berm"      -2.07
    " Helic"     -1.95
    " oak"       -1.91
    " crane"     -1.91
    " Misty"     -1.89
    " Windsor"   -1.88
    " Cherokee"  -1.88
    " Cobra"     -1.87
    " Lizard"    -1.87
    " Bermuda"   -1.86
    Positive Logits
    " meaningless"  2.41
    " anyways"      2.37
    " shouldn"      2.32
    " detriment"    2.28
    " inefficient"  2.23
    "urden"         2.23
    " anyway"       2.21
    " useless"      2.18
    " inevitably"   2.14
    "hemer"         2.13
    Activation Density: 0.004%

    No Known Activations