INDEX
    Explanations

    nouns related to technology and objects

    Negative Logits
    " hopes"       -0.69
    " regrets"     -0.69
    "tains"        -0.62
    " fears"       -0.61
    " discovers"   -0.61
    " grounds"     -0.60
    " tries"       -0.60
    " Grounds"     -0.60
    " believes"    -0.60
    " agrees"      -0.59
    Positive Logits
    " are"         1.17
    " aren"        1.17
    " ARE"         1.04
    " comprise"    0.96
    " differ"      0.93
    " vary"        0.92
    "are"          0.91
    " weren"       0.91
    " were"        0.90
    " tend"        0.90
    Activation Density: 0.471%

    No Known Activations