INDEX
    Explanations

    situations where clarity or explicitness is emphasized

    Negative Logits
    Token             Logit
    " Niet"           -0.65
    " horizont"       -0.63
    "rought"          -0.59
    "cheon"           -0.57
    "idth"            -0.57
    " exceeded"       -0.57
    "rity"            -0.56
    "verages"         -0.56
    "Root"            -0.56
    " spearheaded"    -0.55

    Positive Logits
    Token             Logit
    " disappear"      0.80
    " noises"         0.79
    " impression"     0.77
    " happen"         0.76
    " debut"          0.76
    " mistake"        0.76
    "sense"           0.75
    "ends"            0.73
    " Murd"           0.72
    " noise"          0.71
    Activation Density: 1.243%

    No Known Activations