    Explanations

    comparisons and contrasts, particularly involving "versus" or similar terms

    Negative Logits
    —"                  -0.56
     سكانية             -0.55
    』。                 -0.54
     bezeichneter       -0.54
    -------------</     -0.52
    --.                 -0.52
    ']")                -0.50
    ———-                -0.50
    》.                 -0.49
    ──                  -0.48
    Positive Logits
    .,                  1.08
    .:                  0.91
    .;                  0.81
    ./                  0.78
    oneofs              0.78
    .!                  0.77
    adaptiveStyles      0.75
     Hwy                0.72
     betweenstory       0.72
    etera               0.71
    Activation Density: 0.446%

    No Known Activations