INDEX
    Explanations

    phrases that express comparisons and contrasts between different concepts or entities

    Negative Logits
    Token         Logit
    " Andersen"   -0.76
    "aeda"        -0.74
    "enhagen"     -0.73
    "ilogy"       -0.72
    "akuya"       -0.71
    " PsyNet"     -0.70
    "ocument"     -0.70
    "document"    -0.70
    "\\\\\\\\"    -0.69
    "elsen"       -0.67
    Positive Logits
    Token         Logit
    "Medium"      1.48
    " medium"     1.43
    " Medium"     1.36
    " low"        1.19
    "medium"      1.18
    " shallow"    1.13
    "small"       1.13
    " Low"        1.10
    "Low"         1.09
    "low"         1.08
    Act Density 3.358%

    No Known Activations