INDEX
    Explanations

    words and phrases related to adjectives and their usage in modifying nouns

    Negative Logits
    abad
    -0.15
    anz
    -0.14
     LAB
    -0.14
    bob
    -0.14
     Made
    -0.14
     made
    -0.13
    oub
    -0.13
    itia
    -0.13
     Reserved
    -0.13
    omap
    -0.13
    Positive Logits
     targeted
    0.29
     target
    0.28
    target
    0.25
     مورد
    0.24
     targ
    0.23
     targets
    0.23
    covered
    0.21
    -target
    0.21
    Target
    0.21
     involved
    0.21
    Activation Density: 0.291%

    No Known Activations