    Explanations

    naming concepts and their attributes

    Negative Logits
    Token           Logit
    ?               0.43
    !               0.43
    あくまで        0.41
    "               0.41
     degenerative   0.41
     immun          0.41
     epigenetic     0.40
     marinade       0.40
    immune          0.39
     phyt           0.39
    Positive Logits
    Token           Logit
    рт              0.45
     años           0.45
    ಿಗೆ             0.44
     যশোরে          0.44
    සිය             0.43
    یس              0.43
     关闭           0.43
    Мар             0.42
     passos         0.41
    Го              0.41
    Activation Density: 0.059%

    No Known Activations