    Explanations

    phrases that indicate specific conditions or circumstances when something occurs

    Negative Logits
    " purpoſe"      -0.95
    " himſelf"      -0.85
    " ſtate"        -0.84
    " houſe"        -0.80
    " ſame"         -0.76
    " Jefus"        -0.76
    " rodríguez"    -0.73
    " auroit"       -0.73
    " ſy"           -0.72
    " Majefty"      -0.71
    Positive Logits
    " when"         1.09
    " WHEN"         1.01
    " When"         0.99
    "when"          0.98
    "WHEN"          0.93
    "When"          0.91
    " they"         0.88
    " we"           0.84
    " när"          0.82
    "cuando"        0.78
    Act Density 0.102%

    No Known Activations