INDEX
    Explanations

    the word "mean" used in various contexts

    phrases that indicate qualifications or clarifications about previous statements

    Negative Logits
    Token           Logit
    "ngth"          -0.82
    "odan"          -0.75
    "hiba"          -0.73
    " sidx"         -0.73
    "albeit"        -0.68
    "figured"       -0.67
    "alez"          -0.65
    "otype"         -0.65
    "utenberg"      -0.63
    "antha"         -0.63
    Positive Logits
    Token           Logit
    " anything"     1.00
    " anymore"      0.98
    " anyone"       0.90
    " nor"          0.89
    " anybody"      0.80
    " everyone"     0.79
    " any"          0.77
    " everything"   0.77
    " necessarily"  0.76
    " abandoning"   0.74
    Activation Density: 0.080%

    No Known Activations