INDEX
    Explanations

    instances of the word "we" in various contexts

    Negative Logits
     a              -0.81
    ')],            -0.81
     the            -0.79
     "              -0.77
    ICAGO           -0.76
     Griffin        -0.75
    )")             -0.74
     podjela        -0.73
    "):             -0.73
    )");            -0.72
    Positive Logits
     Thebes         0.84
     trypto         0.70
     Jof            0.69
     enfans         0.67
     houſe          0.66
     Howe           0.65
     Canva          0.65
     Holocene       0.65
     heretics       0.64
     Hereford       0.64
    Activation Density: 0.685%

    No Known Activations