Explanations
concepts related to universal declarations and rights
Negative Logits
iw          -0.20
lessly      -0.17
aben        -0.16
leton       -0.15
lessness    -0.15
esar        -0.15
Ïĥκ         -0.14
alo         -0.14
yonel       -0.14
eo          -0.14
Positive Logits
izing       0.22
ize         0.22
ized        0.22
ist         0.21
ities       0.21
ised        0.20
ists        0.19
istic       0.18
mente       0.18
ization     0.18
Activations Density 0.014%
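
The page does not say how these figures are computed. As a rough sketch only, the snippet below assumes that activation density is the fraction of tokens on which the feature has a nonzero activation, and that the logit lists are the lowest and highest entries of a per-token logit-effect mapping. The function names `activation_density` and `top_logits` and the toy activation list are hypothetical; the four token values are copied from the lists above.

```python
def activation_density(activations):
    """Fraction of tokens on which the feature fires (activation > 0).

    Assumption: the dashboard's "density" counts nonzero activations.
    """
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


def top_logits(logit_effects, k=10):
    """Return (k most negative, k most positive) tokens by logit effect."""
    ranked = sorted(logit_effects.items(), key=lambda kv: kv[1])
    negative = ranked[:k]        # ascending: most negative first
    positive = ranked[::-1][:k]  # descending: most positive first
    return negative, positive


# Hypothetical toy data: the feature fires on 1 token out of 10,000.
toy_activations = [0.0] * 9999 + [1.3]
density = activation_density(toy_activations)
print(f"density = {density:.3%}")

# A small subset of the token/value pairs listed above.
toy_effects = {"izing": 0.22, "ist": 0.21, "iw": -0.20, "lessly": -0.17}
negative, positive = top_logits(toy_effects, k=2)
print("negative:", negative)
print("positive:", positive)
```

Under that assumed definition, a density of 0.014% would correspond to the feature firing on roughly 14 tokens out of every 100,000.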