INDEX
Explanations
phrases or words related to questioning authority
Head Attr Weights
Head  Weight
0     0.08
1     0.08
2     0.09
3     0.07
4     0.08
5     0.09
6     0.10
7     0.07
8     0.08
9     0.07
10    0.07
11    0.07
Negative Logits
Token          Logit
\<             -2.91
ghazi          -2.63
裏�             -2.55
"$:/           -2.54
etheless       -2.52
>[             -2.50
GUI            -2.45
Oo             -2.42
incompetence   -2.36
mildly         -2.31
Positive Logits
Token          Logit
ierce          2.50
Trader         2.49
unin           2.44
Naomi          2.40
Quantity       2.36
adish          2.34
Mutual         2.31
asin           2.28
tor            2.26
trade          2.22
Activations Density: 0.000%