INDEX
    Explanations

    words related to the concept of belief or opinion

    the presence of the word "there" in various contexts

    Negative Logits
    "EA"            -0.63
    " Armored"      -0.60
    " Dish"         -0.57
    " Khe"          -0.57
    "elta"          -0.55
    " Maw"          -0.54
    "ointed"        -0.54
    " Cum"          -0.53
    " Greenwich"    -0.53
    "SEE"           -0.53
    Positive Logits
    "abouts"        1.51
    "upon"          1.17
    "fore"          1.03
    " shouldn"      0.97
    " isn"          0.95
    " ain"          0.95
    " wasn"         0.95
    " weren"        0.93
    " aren"         0.93
    "after"         0.91
    Activation Density: 0.124%

    No Known Activations