    Explanation: logical expressions

    NEGATIVE LOGITS
    Token            Logit
    "uang"           -0.08
    (not shown)      -0.08
    " муз"           -0.08
    "entre"          -0.08
    (not shown)      -0.08
    "nyder"          -0.08
    " boll"          -0.08
    " plastic"       -0.08
    " exfol"         -0.08
    (not shown)      -0.08

    POSITIVE LOGITS
    Token            Logit
    " propositions"  0.11
    " conjunction"   0.11
    " propos"        0.11
    " Lisp"          0.11
    " weakening"     0.10
    "形式"           0.10
    " conjunct"      0.10
    " XOR"           0.10
    " contrap"       0.10
    ".logical"       0.09

    Activation Density: 0.031%

    No Known Activations