INDEX
    Explanations

    phrases indicating uncertainty or choice

    repeated usage of the word "which" across various contexts

    Negative Logits
    Token           Logit
    "Rog"           -0.81
    "GROUND"        -0.79
    "Balt"          -0.76
    "kj"            -0.73
    "bly"           -0.72
    "kamp"          -0.71
    "UX"            -0.71
    "FINE"          -0.69
    " Glob"         -0.69
    "fitting"       -0.67
    Positive Logits
    Token           Logit
    " kinds"        0.91
    " sorts"        0.81
    "soever"        0.78
    " redes"        0.76
    " types"        0.74
    " direction"    0.70
    " kind"         0.70
    " flavors"      0.69
    " contingency"  0.68
    " ones"         0.66
    Activation Density: 0.051%

    No Known Activations