INDEX
    Explanations

    comparisons or distinctions between different entities or concepts

    references to differences and distinctions between concepts or entities

    Negative Logits
    Token     Logit
    "ãĥĦ"     -0.79
    "ATA"     -0.79
    "vez"     -0.76
    "mberg"   -0.73
    "onz"     -0.71
    "ãĤ®"     -0.70
    "rive"    -0.66
    "ãĤ±"     -0.64
    "å°Ĩ"     -0.63
    "ico"     -0.62
    Positive Logits
    Token           Logit
    " between"      1.61
    "between"       1.34
    " Between"      1.26
    "iveness"       1.08
    " separating"   0.98
    "iator"         0.97
    "ials"          0.96
    " maker"        0.94
    "iating"        0.90
    "Between"       0.85
    Activation Density: 0.058%

    No Known Activations