INDEX
Explanations
references to official documents and policies
Head Attr Weights
0:0.02
1:0.01
2:0.07
3:0.04
4:0.03
5:0.04
6:0.02
7:0.06
8:0.03
9:0.05
10:0.44
11:0.12
Negative Logits
Galile       -1.50
vigilante    -1.48
tragedy      -1.44
tragedies    -1.41
vigilant     -1.40
mare         -1.39
pity         -1.38
lyn          -1.34
"]=>         -1.33
vengeance    -1.33
Positive Logits
��           2.13
outlining    2.11
detailing    2.09
itled        2.00
titled       1.92
outlines     1.89
enum         1.81
dated        1.78
revised      1.72
handwritten  1.68
Activations Density 0.363%