INDEX
    Explanations

    instances of the word "half" followed by a number indicating a proportion

    Negative Logits
    " concess"     -0.64
    "ensis"        -0.62
    "hran"         -0.59
    "lict"         -0.57
    "Reviewer"     -0.55
    "berus"        -0.55
    "anwhile"      -0.55
    "andr"         -0.55
    " destro"      -0.54
    "kson"         -0.54
    Positive Logits
    "heartedly"    0.79
    " dozen"       0.78
    "imet"         0.69
    "çͰ"           0.69
    " percent"     0.67
    "wheel"        0.65
    " of"          0.65
    "azo"          0.64
    "pipe"         0.63
    "hearted"      0.61
    Act Density 0.028%

    No Known Activations