INDEX
    Explanations

    phrases that indicate the presence or existence of something

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.09
    3:0.06
    4:0.14
    5:0.02
    6:0.04
    7:0.29
    8:0.02
    9:0.04
    10:0.08
    11:0.13
    Negative Logits
    ilage:-1.49
     appre:-1.46
     endeav:-1.35
     endeavors:-1.35
     ende:-1.34
    aciously:-1.32
    ancial:-1.25
    NBC:-1.25
    RG:-1.25
    ��:-1.22
    Positive Logits
    Reviewer:1.45
    abo:1.34
     oneself:1.32
    mite:1.31
     Qué:1.30
    ouls:1.28
     Nanto:1.28
     presence:1.21
     witnesses:1.18
     parentheses:1.17
    Act Density 0.003%

    No Known Activations