INDEX
    Explanations

    adjectives that express strong or negative qualities

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.06
    3:0.06
    4:0.05
    5:0.04
    6:0.41
    7:0.11
    8:0.04
    9:0.04
    10:0.05
    11:0.05
    Negative Logits
    "DAY"          -1.36
    " Centauri"    -1.23
    " neighb"      -1.21
    "ゴン"         -1.21
    " Dinosaur"    -1.19
    " Za"          -1.18
    "Pool"         -1.10
    " Jeanne"      -1.10
    " accuser"     -1.10
    " Xia"         -1.10
    Positive Logits
    " (>"          1.57
    "arnaev"       1.51
    "ersive"       1.45
    "cientious"    1.40
    "anto"         1.39
    "rez"          1.37
    "ileged"       1.35
    "esta"         1.35
    " agric"       1.34
    "tarian"       1.31
    Act Density 0.012%

    No Known Activations