INDEX
    Explanations

    expressions related to beliefs, thoughts, and assumptions

    Negative Logits
    Token               Logit
    " Acerca"           -0.61
    "him"               -0.60
    " Darauf"           -0.59
    " honom"            -0.59
    "évaluateur"        -0.58
    "对我"              -0.55
    " otomatig"         -0.53
    "对他"              -0.53
    " didst"            -0.52
    "BuilderFactory"    -0.52
    Positive Logits
    Token       Logit
    " they"     2.24
    " we"       1.77
    " there"    1.76
    " it"       1.32
    " he"       1.27
    " she"      1.16
    "they"      1.16
    " you"      1.10
    "there"     1.03
    " они"      1.02
    Activation Density: 0.626%

    No Known Activations