INDEX
    Explanations

    phrases indicating attempts or efforts to accomplish something

    Negative Logits
    " myſelf"            -0.80
    " itſelf"            -0.79
    "ValueGeneration"    -0.74
    " pleaſure"          -0.74
    " ſtate"             -0.73
    " Futura"            -0.71
    " Jefus"             -0.70
    " himſelf"           -0.68
    " occafion"          -0.65
    " cauſe"             -0.65
    Positive Logits
    " attempt"      1.39
    " attempts"     1.31
    " trying"       1.29
    " versucht"     1.23
    " tries"        1.22
    " Trying"       1.20
    " versuchen"    1.18
    " tentando"     1.17
    "Trying"        1.16
    "trying"        1.15
    Activation Density: 0.120%

    No Known Activations