INDEX
    Explanations

    expressions of excitement or happiness

    expressions of excitement or strong positive emotions

    Negative Logits
    Token           Logit
    "ciplinary"     -0.85
    "poral"         -0.81
    "enhagen"       -0.78
    "road"          -0.78
    "lay"           -0.73
    "icrobial"      -0.72
    "dule"          -0.70
    "sterdam"       -0.68
    "prison"        -0.67
    "icipated"      -0.67

    Positive Logits
    Token           Logit
    " exclaim"       0.72
    "ION"            0.69
    " exclaimed"     0.67
    " VID"           0.66
    " delight"       0.65
    "iously"         0.64
    " delighted"     0.64
    " Euros"         0.64
    " Romeo"         0.64
    "ÃįÃį"           0.64
    Activation Density: 0.049%

    No Known Activations