INDEX
    Explanations

    phrases indicating a comparison or a choice between two options

    phrases concerning uncertainty or conditional statements

    Negative Logits
    Token          Logit
    " Pony"        -0.67
    " Room"        -0.61
    " Rocket"      -0.59
    " Symphony"    -0.59
    " Chau"        -0.59
    "://"          -0.56
    " Romeo"       -0.56
    " Trophy"      -0.56
    " Cardinal"    -0.56
    " Tele"        -0.56
    Positive Logits
    Token          Logit
    "Else"         0.99
    "acles"        0.96
    " else"        0.80
    "acle"         0.79
    "rogens"       0.75
    "nam"          0.73
    "odd"          0.70
    "rame"         0.69
    "rist"         0.68
    "rha"          0.67
    Activation Density: 0.022%

    No Known Activations