INDEX
Explanations
phrases related to human rights
references to human rights
Negative Logits
charger     -0.71
itz         -0.69
oard        -0.68
styles      -0.67
bub         -0.66
upstairs    -0.66
condos      -0.66
charg       -0.66
tariff      -0.66
rentals     -0.66
Positive Logits
Human       3.69
Human       3.01
human       2.24
human       2.14
Humans      2.01
HUM         1.65
humans      1.61
Humanity    1.61
Mankind     1.55
humans      1.49
Activations Density: 0.015%