INDEX
    Explanations

    words indicating quantities or articles in various forms

    Negative Logits
     houſe        -0.80
     myſelf       -0.78
     fumée        -0.76
     poussière    -0.75
     himſelf      -0.74
     Phry         -0.72
     Assyrian     -0.72
     Majefty      -0.72
     themſelves   -0.71
     paille       -0.70
    Positive Logits
     a            0.96
    {}",          0.95
     large        0.90
    ]))\n         0.88
     few          0.86
     great        0.85
     huge         0.84
    "):\n         0.82
     hundred      0.81
     particular   0.80
    Act Density 0.009%

    No Known Activations