INDEX
    Explanations

    words indicating emphasis or importance

    Negative Logits
    hou       -0.74
    ysis      -0.72
    …]        -0.72
    ipop      -0.72
    ously     -0.70
    dal       -0.65
    iste      -0.65
    rette     -0.64
    IGH       -0.64
    rences    -0.64
    Positive Logits
    _-             0.86
     namely        0.70
    yes            0.64
    ->             0.63
     albeit        0.61
    without        0.60
     aka           0.60
     Instruments   0.59
    especially     0.59
    which          0.59
    Activation Density 0.048%

    No Known Activations