INDEX
    Explanations

    quotes enclosed in quotation marks

    Negative Logits
     honors        -0.82
     favor         -0.82
     honor         -0.79
     nude          -0.74
     bunk          -0.74
     clo           -0.73
     eligible      -0.72
     classified    -0.71
     slam          -0.71
     grades        -0.71
    Positive Logits
    Therefore      1.49
    It             1.45
    However        1.43
    They           1.42
    We             1.40
    Whereas        1.39
    There          1.38
    But            1.36
    Secondly       1.35
    If             1.34
    Activation Density: 0.091%

    No Known Activations