    Negative Logits
    Token          Logit
    " recorded"    -0.73
    "aling"        -0.71
    " industry"    -0.58
    " Industry"    -0.56
    "ALING"        -0.54
    "Industry"     -0.49
    " INDUSTRY"    -0.48
    " des"         -0.45
                   -0.45
    "bre"          -0.44
    Positive Logits
    Token          Logit
    " pleaſure"    1.22
    "ſelf"         1.20
    " houſe"       1.18
    " Efq"         1.13
    " myſelf"      1.12
    " iſt"         1.11
    " ſta"         1.10
    " ſche"        1.09
    " ―――――"       1.08
    " Monfieur"    1.07
    Act Density 0.135%

    No Known Activations