INDEX
    Explanations

    phrases related to safety and safe spaces

    Negative Logits
    Token               Logit
    "issance"           -0.92
    "Lenin"             -0.74
    " disproportion"    -0.73
    " intensify"        -0.73
    "ithing"            -0.67
    "acio"              -0.67
    "ribune"            -0.65
    "ennial"            -0.65
    " nostalg"          -0.64
    " favor"            -0.64
    Positive Logits
    Token               Logit
    " safe"             0.88
    " perimeter"        0.77
    " safer"            0.77
    " Safe"             0.77
    " precautions"      0.76
    " safest"           0.75
    "Safe"              0.74
    " havens"           0.73
    " unprotected"      0.73
    " Saf"              0.72
    Activation Density: 0.077%

    No Known Activations