INDEX
    Explanations

    phrases expressing personal opinions

    expressions of personal opinions and viewpoints

    Negative Logits
     deposited    -0.70
    bard          -0.67
    artney        -0.64
    vier          -0.61
    iao           -0.59
    kefeller      -0.59
    etting        -0.58
     contamin     -0.56
    ça            -0.56
    bor           -0.56
    Positive Logits
    æĦ            0.73
    Īè            0.71
    VIDIA         0.69
    DEV           0.67
    âĶĢâĶĢâĶĢâĶĢ  0.67
    inguishable   0.67
    ²¾            0.65
    à¼            0.65
    HO            0.65
     opin         0.64
    Act Density 0.033%

    No Known Activations