INDEX
Explanations
instances of conflict or confrontation related to authority figures
Head Attr Weights
0:0.05
1:0.05
2:0.04
3:0.04
4:0.05
5:0.03
6:0.18
7:0.03
8:0.08
9:0.34
10:0.02
11:0.04
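
The head attribution weights listed above can be ranked to see which attention heads contribute most to this feature. The following is a minimal sketch, not the dashboard's own code: the dictionary `head_attr` and the helper `top_heads` are hypothetical names introduced here for illustration, and the values are copied directly from the listing.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# The variable and function names are illustrative assumptions, not part of the source.
head_attr = {
    0: 0.05, 1: 0.05, 2: 0.04, 3: 0.04,
    4: 0.05, 5: 0.03, 6: 0.18, 7: 0.03,
    8: 0.08, 9: 0.34, 10: 0.02, 11: 0.04,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


ranked = top_heads(head_attr, k=3)
total = sum(head_attr.values())

for head, weight in ranked:
    print(f"head {head}: {weight:.2f}")
print(f"sum of listed weights: {total:.2f}")
```

On these values, heads 9 and 6 come out well ahead of the rest, and the listed weights sum to roughly 0.95 rather than 1, which may reflect rounding to two decimals in the display.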
Negative Logits
azel    -3.91
ennett  -3.45
Ivan    -3.41
ガ      -3.28
MAC     -3.26
Hazel   -3.21
オ      -3.14
Marsh   -3.13
カ      -3.11
シ      -3.09
Positive Logits
Th      6.80
TH      6.44
Th      6.18
Thom    5.69
Thom    5.46
Thur    5.37
TH      5.33
th      5.23
Thai    4.98
Mith    4.88
Activations Density 0.015%