INDEX
Explanations
words related to human rights and social justice issues
Head Attr Weights
0:0.03
1:0.02
2:0.05
3:0.06
4:0.03
5:0.38
6:0.06
7:0.01
8:0.15
9:0.08
10:0.05
11:0.03
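
As a rough illustration of how the head attribution weights listed above might be read, the Python sketch below ranks the twelve heads by weight and reports the share carried by the largest one. The names `head_weights` and `rank_heads` are hypothetical, and treating the weights as comparable shares of one total is an assumption; the listing does not say how they are normalized.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.02, 2: 0.05, 3: 0.06, 4: 0.03, 5: 0.38,
    6: 0.06, 7: 0.01, 8: 0.15, 9: 0.08, 10: 0.05, 11: 0.03,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted by descending weight.

    Python's sort is stable, so heads with equal weight keep index order.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]

# Assumption: the weights are shares of a common total, so a ratio is meaningful.
total = sum(head_weights.values())
top_share = top_weight / total

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
print(f"share of listed total held by top head: {top_share:.1%}")
```

Under this reading, head 5 carries the largest weight (0.38), followed by heads 8 and 9. The listed weights sum to 0.95 rather than 1.00, which may reflect rounding in the displayed values.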
Negative Logits
jug        -1.50
lives      -1.46
stable     -1.43
Starship   -1.39
ewater     -1.36
racuse     -1.35
Host       -1.33
Unic       -1.30
Helic      -1.29
utable     -1.29
Positive Logits
�          1.65
echo       1.64
arte       1.56
draft      1.54
forums     1.53
enko       1.53
include    1.52
course     1.47
ax         1.44
�          1.43
Activations Density 0.130%