INDEX
    Explanations

    sentences with evaluative language or expressions of opinion regarding various subjects

    Negative Logits
    " either"     -0.25
    " Either"     -0.22
    " instead"    -0.21
    " even"       -0.21
    "either"      -0.20
    "竣"          -0.20
    " simply"     -0.19
    "Either"      -0.19
    "instead"     -0.19
    " einfach"    -0.18
    Positive Logits
    " certainly"       0.28
    " initially"       0.28
    " technically"     0.26
    " nomin"           0.25
    " may"             0.24
    " Certainly"       0.22
    "may"              0.21
    " Initially"       0.21
    " occasionally"    0.20
    "initial"          0.20
    Act Density 0.394%

    No Known Activations