INDEX
Explanations
concepts related to psychological and emotional states or conditions
Head Attr Weights
0:0.03
1:0.05
2:0.07
3:0.03
4:0.02
5:0.12
6:0.09
7:0.13
8:0.08
9:0.05
10:0.14
11:0.13
Negative Logits
ギ:-1.21
ּ:-1.21
dit:-1.14
Olympia:-1.10
keleton:-1.08
Adin:-1.07
DERR:-1.07
thia:-1.06
convict:-1.04
�:-1.03
Positive Logits
respectively:1.90
oven:1.21
depending:1.21
respective:1.15
Characters:1.13
reviewed:1.13
These:1.11
ergy:1.09
ependent:1.08
collectively:1.07
Activations Density 0.343%