INDEX
    Explanations

    f/ref followed by specific words

    Negative Logits
     Estr          0.79
     Kim           0.73
     imperative    0.72
     kim           0.72
     Abandon       0.69
     visceral      0.69
     abandon       0.69
     dep           0.68
     pri           0.68
     Xbox          0.68
    Positive Logits
    avorable       1.39
    requent        1.29
    amiliar        1.28
    requently      1.27
    avourable      1.27
    iciency        1.25
    ashion         1.21
    ucking         1.19
    ashions        1.18
    icient         1.18
    Activation Density: 0.288%

    No Known Activations