    Explanations

    phrases indicating comparison or similarity

    Negative Logits
    Token             Logit
     ModelExpression  -0.69
     ſtand            -0.54
    íč                -0.53
     chofe            -0.52
                      -0.49
     οποία            -0.48
     anſ              -0.48
     Trabal           -0.48
     uſed             -0.48
     ſta              -0.47
    Positive Logits
    Token         Logit
     a            1.23
     an           0.97
     part         0.73
    "]="          0.72
    ]='\          0.67
                  0.67
    ]=="          0.64
    =?";          0.64
     полноцен     0.64
     Bestandteil  0.63
    Activation Density: 0.239%

    No Known Activations