INDEX
    Explanations

    text related to policy discussion with a focus on regulations and exceptions

    Negative Logits
    " depic"        -0.63
    " fameux"       -0.63
    " shenan"       -0.62
    " Nicolai"      -0.61
    " reluct"       -0.60
    " intersper"    -0.60
    " Bartholo"     -0.60
    " McLaugh"      -0.59
    " milf"         -0.59
    " apprehen"     -0.58
    Positive Logits
    " rather"       1.82
    "rather"        1.68
    " instead"      1.49
    "instead"       1.41
    "Rather"        1.39
    " Rather"       1.39
    "Instead"       1.27
    "而不是"        1.19
    " Instead"      1.18
    " plutôt"       1.13
    Activation Density 0.686%

    No Known Activations