INDEX
    Explanations

    references to legal restrictions and implications surrounding free speech

    Negative Logits
    Token            Logit
    "902"            -0.16
    " Decom"         -0.15
    "образ"          -0.15
    " Wallpaper"     -0.15
    "окол"           -0.14
    "imitive"        -0.14
    "elu"            -0.14
    "lique"          -0.14
    "loggedin"       -0.13
    "ernational"     -0.13

    Positive Logits
    Token            Logit
    " speech"        0.55
    " First"         0.53
    " Speech"        0.50
    "Speech"         0.46
    "speech"         0.45
    " free"          0.42
    " freedom"       0.41
    "First"          0.39
    "peech"          0.39
    "free"           0.35
    Activation Density: 0.237%

    No Known Activations