INDEX
    Explanations

    idiomatic expressions and phrases from various contexts

    references to outcomes that involve winning or losing in various contexts

    Negative Logits
    "76561"          -0.69
    " newsp"         -0.60
    " repertoire"    -0.58
    " orche"         -0.57
    " referen"       -0.57
    " cir"           -0.53
    "vernment"       -0.52
    " RELE"          -0.52
    " Tradable"      -0.52
    " phased"        -0.51

    Positive Logits
    ";}"             0.97
    ".--"            0.90
    ";"              0.90
    ".—"             0.88
    "."              0.84
    ".;"             0.81
    "():"            0.79
    "'';"            0.78
    ".?"             0.76
    ".#"             0.76
    Activation Density: 1.010%

    No Known Activations