INDEX
Explanations
phrases about gender and authority
Head Attr Weights
0:0.02
1:0.02
2:0.09
3:0.05
4:0.17
5:0.02
6:0.07
7:0.35
8:0.02
9:0.03
10:0.06
11:0.05
Negative Logits
avering:-1.56
accur:-1.53
maintained:-1.53
attribute:-1.51
specifications:-1.50
urance:-1.49
urances:-1.49
definition:-1.48
dimensions:-1.47
formulas:-1.46
Positive Logits
DragonMagazine:1.80
Brave:1.57
��:1.56
VIDEOS:1.50
Guest:1.49
Welcome:1.37
Encounter:1.33
Morning:1.32
aneers:1.32
GOODMAN:1.31
Activations Density 0.001%