    Explanations

    content related to free speech and its legal implications

    Negative Logits
    "902"           -0.15
    "iju"           -0.14
    "gett"          -0.14
    " Wallpaper"    -0.14
    "811"           -0.14
    "окол"          -0.14
    "loggedin"      -0.14
    "fm"            -0.14
    "acin"          -0.13
    " electr"       -0.13
    Positive Logits
    " speech"       0.51
    " Speech"       0.46
    " First"        0.44
    "speech"        0.42
    "Speech"        0.42
    " free"         0.39
    "peech"         0.36
    " freedom"      0.34
    "First"         0.32
    "free"          0.31
    Activation Density: 0.197%

    No Known Activations