INDEX
    Explanations

    phrases indicating uncertainty or hesitation

    Negative Logits
    "ãĥŀ"             -0.57
    " abst"           -0.54
    " depreciation"   -0.54
    "REDACTED"        -0.54
    " sow"            -0.53
    " 1886"           -0.51
    " ol"             -0.51
    " rupture"        -0.50
    " coincide"       -0.50
    " avg"            -0.48
    Positive Logits
    "pering"          0.74
    "borgh"           0.73
    "thing"           0.65
    "lins"            0.65
    "paste"           0.65
    "forward"         0.64
    "tree"            0.63
    "links"           0.62
    "ston"            0.62
    "emade"           0.61
    Act Density 5.837%

    No Known Activations