INDEX
    Explanations

    various forms of the word "honest" and related concepts

    Negative Logits
    Token               Logit
    "tery"              -0.18
    "lassian"           -0.17
    "hlen"              -0.16
    ".scalablytyped"    -0.16
    "sWith"             -0.16
    "izi"               -0.15
    "uteur"             -0.15
    "ture"              -0.14
    "sko"               -0.14
    "rost"              -0.14

    Positive Logits
    Token               Logit
    " Abe"               0.22
    "-to"                0.20
    " broker"            0.20
    "/auth"              0.19
    " brokers"           0.18
    "ably"               0.18
    " appraisal"         0.17
    "/raw"               0.17
    "bones"              0.15
    " Broker"            0.15

    Act Density 0.036%

    No Known Activations