    Negative Logits
     식으로       0.39
     widow       0.38
    (blank)      0.37
     scrapbook   0.37
     house       0.36
     bathroom    0.36
     poem        0.36
     思い        0.36
     or          0.35
     coffee      0.35
    Positive Logits
    t      0.59
    m      0.53
    a      0.46
    r      0.45
    h      0.45
    p      0.42
    d      0.41
    e      0.40
    to     0.40
    tom    0.39
    Activation Density: 0.340%

    No Known Activations