Explanations
phrases or expressions that indicate a contrast or exception
Head Attr Weights
0:0.06
1:0.05
2:0.08
3:0.08
4:0.05
5:0.07
6:0.05
7:0.31
8:0.06
9:0.04
10:0.06
11:0.04
Negative Logits
payday            -2.80
Companies         -2.80
Flavoring         -2.79
wealthier         -2.75
casinos           -2.75
casino            -2.68
wealthy           -2.68
pharmaceutical    -2.59
illac             -2.59
riches            -2.58
Positive Logits
LET               2.68
CHA               2.67
PsyNetMessage     2.50
ゴ                2.47
Dialogue          2.46
letter            2.44
裏�               2.42
HAM               2.42
isin              2.38
determination     2.35
Activations Density 0.001%