    Explanations

    phrases indicating novelty or surprise

    phrases indicating experiences of novelty or uniqueness

    Negative Logits
    Token         Logit
    "Tips"        -0.70
    "lear"        -0.67
    "reg"         -0.65
    "ŃĶ"          -0.64
    "fund"        -0.63
    "supp"        -0.62
    "haus"        -0.61
    "absor"       -0.61
    "ERT"         -0.61
    "relations"   -0.60
    Positive Logits
    Token         Logit
    " anything"   0.95
    " anybody"    0.88
    " anyone"     0.81
    " daylight"   0.77
    " ANY"        0.71
    " anywhere"   0.70
    " dime"       0.70
    " them"       0.68
    " nor"        0.68
    " him"        0.67
    Activation Density: 0.070%

    No Known Activations