    Explanations

    phrases that articulate differences and distinctions between concepts or entities

    Negative Logits
    " fede"         -0.48
    " perchance"    -0.47
    "adre"          -0.46
    "/"             -0.45
    " elsewhere"    -0.43
    " Sotto"        -0.42
    "ubi"           -0.42
    "Westfalen"     -0.42
    " givet"        -0.41
    "手段"          -0.41
    Positive Logits
    " Differences"    1.50
    " differences"    1.47
    " difference"     1.47
    "Differences"     1.47
    " Difference"     1.44
    " DIFFERENCE"     1.33
    " Unterschied"    1.32
    "Difference"      1.32
    "difference"      1.32
    " verschil"       1.30
    Act Density 0.345%

    No Known Activations