INDEX
    Explanations

    references to contrasting options or aspects within a scenario

    Negative Logits
    Token          Logit
    " 1915"        -0.72
    "udence"       -0.59
    " 1903"        -0.57
    " 1919"        -0.56
    " 1906"        -0.55
    " 1918"        -0.55
    " reintrodu"   -0.55
    " 1961"        -0.55
    " 2024"        -0.54
    " 1912"        -0.54
    Positive Logits
    Token          Logit
    "worldly"      1.90
    " than"        1.09
    "wise"         1.07
    "etheless"     0.95
    "ials"         0.89
    "Redd"         0.88
    "than"         0.87
    "ially"        0.85
    "parts"        0.83
    " Than"        0.80
    Activation Density: 0.049%

    No Known Activations