INDEX
    Negative Logits
    Token         Logit
     ſever        -0.61
     pleaſure     -0.60
     purpoſe      -0.60
     ſtate        -0.55
     Reſ          -0.54
     ſta          -0.54
     ſeveral      -0.54
     Diſ          -0.52
     Houſe        -0.52
     ſche         -0.52
    Positive Logits
    Token         Logit
     referenties  0.80
     To           0.78
     On           0.77
     At           0.76
     سكانية       0.73
     By           0.72
     Over         0.72
     go           0.72
    expandindo    0.71
     Chwiliwch    0.70
    Act Density 0.001%

    No Known Activations