    Explanations

    phrases indicating comparisons or similarities

    Head Attr Weights
    0:0.02
    1:0.09
    2:0.12
    3:0.08
    4:0.02
    5:0.03
    6:0.08
    7:0.12
    8:0.05
    9:0.07
    10:0.06
    11:0.22
    Negative Logits
    " helicop"        -1.31
    " contrace"       -1.28
    " calcul"         -1.18
    " trave"          -1.17
    " unwittingly"    -1.16
    " coerc"          -1.14
    " equivalents"    -1.09
    " complicit"      -1.08
    " sufficiently"   -1.08
    " prematurely"    -1.08
    Positive Logits
    " Same"               1.49
    "rar"                 1.25
    "natureconservancy"   1.25
    "Same"                1.20
    "lishes"              1.17
    "yrs"                 1.11
    ">)"                  1.11
    "zech"                1.08
    "same"                1.06
    " Nina"               1.05
    Act Density 0.004%

    No Known Activations