INDEX
    Explanations

    descriptive terms followed by specific nouns

    Negative Logits
     turbines          0.52
     schoolchildren    0.52
     committees        0.49
     museums           0.48
     immort            0.47
     neutrons          0.47
     essays            0.46
     labyr             0.46
     examinations      0.46
     loudspeakers      0.46
    Positive Logits
    类型                 0.57
    转换                 0.50
    曾经                 0.50
    颜值                 0.49
    Instagram            0.48
    ིས་                  0.48
    (token not shown)    0.48
    Emoji                0.47
    COc                  0.47
    (token not shown)    0.47
    Act Density 0.130%

    No Known Activations