INDEX
    Explanations

    expressions of agreement and consensus

    Negative Logits
     Efq          -1.17
     Theſe        -1.17
     Houſe        -1.12
     itſelf       -1.11
     Jefus        -1.09
     Monfieur     -1.08
     Anſ          -1.05
     Reſ          -1.03
     Beſ          -1.00
     myſelf       -0.99
    Positive Logits
    ו             0.78
                  0.69
     agreed       0.68
     (            0.65
     R            0.62
    agre          0.60
     A            0.59
     agree        0.59
     AGRE         0.58
    <eos>         0.58
    Act Density 0.163%

    No Known Activations