    Explanations

    the word "one" in various contexts

    Negative Logits
    "uits"        -0.89
    "its"         -0.81
    "ourses"      -0.77
    "models"      -0.77
    "ooks"        -0.74
    "folk"        -0.73
    " Parties"    -0.71
    "acements"    -0.71
    "ories"       -0.71
    "items"       -0.71
    Positive Logits
    " apiece"         0.96
    " unnamed"        0.79
    " hundred"        0.79
    " else"           0.68
    " person"         0.67
    " single"         0.67
    " unidentified"   0.67
    " Hundred"        0.67
    " observer"       0.66
    " sided"          0.65
    Activation Density: 0.054%

    No Known Activations