Explanations
words related to costs, social interactions, and various forms of measurement
Head Attr Weights
0:0.02
1:0.02
2:0.08
3:0.06
4:0.08
5:0.03
6:0.50
7:0.05
8:0.03
9:0.02
10:0.03
11:0.04
Negative Logits
ALSE: -1.35
Brush: -1.30
CHAT: -1.22
ゴン: -1.22
DAY: -1.13
arenthood: -1.13
MAS: -1.11
liner: -1.11
Rouge: -1.10
algia: -1.10
Positive Logits
erous: 1.69
prosec: 1.51
confir: 1.49
rated: 1.49
itious: 1.48
resistant: 1.43
isable: 1.42
Marketable: 1.41
achy: 1.41
fficient: 1.37
Activations Density 0.043%