    Explanations

    phrases related to connections and relationships between people or entities

    phrases that contain slashes, indicating divisions or categories in text

    Negative Logits
     swell      -0.80
     square     -0.78
     lifetime   -0.78
     densely    -0.77
     sincerely  -0.75
     Chao       -0.74
     younger    -0.74
     mate       -0.73
     leaflets   -0.73
     breed      -0.73
    Positive Logits
    whatever  1.67
    etc       1.67
    dist      1.46
    coll      1.46
    trans     1.45
    tem       1.44
    super     1.42
    non       1.40
    control   1.40
    workshop  1.40
    Act Density 0.052%

    No Known Activations