INDEX
    Explanations

    occurrences of the word "over"

    Negative Logits
    ly          -0.29
    shan        -0.23
    place       -0.22
    eous        -0.19
    wards       -0.19
    whel        -0.19
    icularly    -0.19
    bben        -0.18
    象          -0.18
    ships       -0.17

    Positive Logits
    hang         0.25
    ture         0.25
    tures        0.20
    kill         0.19
    age          0.18
    heid         0.18
    views        0.18
    ha           0.18
    iew          0.17
    ature        0.17
    Act Density 0.030%

    No Known Activations