INDEX
    Explanations

    phrases related to explanations or justifications

    mentions of the word "reason" in various contexts

    Negative Logits
    Token           Logit
    "boro"          -0.66
    "agra"          -0.66
    "KY"            -0.65
    " helicop"      -0.65
    " likeness"     -0.64
    " Riders"       -0.61
    " needle"       -0.61
    "oba"           -0.60
    "inis"          -0.60
    "NetMessage"    -0.59
    Positive Logits
    Token           Logit
    "abl"           1.01
    " reasons"      0.95
    "neum"          0.89
    " sake"         0.80
    "reason"        0.79
    "mpeg"          0.75
    "resy"          0.75
    "asons"         0.74
    " unrelated"    0.72
    " purposes"     0.71
    Activation Density: 0.028%

    No Known Activations