INDEX
    Explanations

    expressions of trust or mistrust

    Negative Logits
    " poffible"    -1.48
    " myſelf"      -1.46
    " Diſ"         -1.44
    " Houſe"       -1.43
    " Theſe"       -1.42
    " greateſt"    -1.40
    " ſmall"       -1.39
    " Anſ"         -1.38
    " Reſ"         -1.37
    " houſe"       -1.35
    Positive Logits
    " trust"        0.81
    (whitespace)    0.62
    "<eos>"         0.61
    " and"          0.60
    (blank)         0.60
    " ("            0.57
    " design"       0.57
    " as"           0.56
    ","             0.56
    " in"           0.55
    Activation Density: 0.141%

    No Known Activations