INDEX
Explanations
phrases related to categories of discrimination and animal rights
Head Attr Weights
0:0.02
1:0.01
2:0.09
3:0.08
4:0.13
5:0.04
6:0.13
7:0.27
8:0.04
9:0.04
10:0.05
11:0.05
Negative Logits
govtrack:-1.63
favorites:-1.55
priority:-1.52
successes:-1.50
fman:-1.41
planners:-1.40
favourites:-1.39
directing:-1.38
liest:-1.38
enroll:-1.38
Positive Logits
Animals:1.68
Beast:1.57
Sacred:1.54
Mamm:1.51
Terrorism:1.51
Corruption:1.50
Violence:1.50
Atmosp:1.48
Memory:1.44
Cruel:1.44
Activations Density 0.003%