INDEX
    Explanations

    references to moral values and agendas in societal contexts

    opinions or perspectives

    abstract nouns and qualities

    NEGATIVE LOGITS
    enderror
    -0.84
     betweenstory
    -0.78
    %")
    -0.76
    serviceWorker
    -0.72
     })}
    -0.71
    tableFuture
    -0.71
     */
    -0.71
    __":
    -0.70
    Демографія
    -0.70
    Viitteet
    -0.69
    POSITIVE LOGITS
     estekak
    0.66
     been
    0.59
     nodig
    0.53
     had
    0.51
    Autoritní
    0.50
    k
    0.48
     Anteil
    0.47
    0.46
     backing
    0.46
     experience
    0.45
    Activation Density: 0.642%

    No Known Activations