INDEX
Explanations
words related to excitement or intense emotions
Head Attr Weights
0:0.02
1:0.01
2:0.06
3:0.06
4:0.10
5:0.03
6:0.05
7:0.43
8:0.03
9:0.04
10:0.07
11:0.04
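
The head attribution weights listed above can be read programmatically to see which head dominates. The following is a minimal sketch, not part of the original dashboard: the names `head_weights` and `dominant_head` are hypothetical, and the values are copied by hand from the list.

```python
# Head attribution weights copied from the list above (head index -> weight).
# The dictionary name and the helper below are illustrative assumptions.
head_weights = {
    0: 0.02, 1: 0.01, 2: 0.06, 3: 0.06, 4: 0.10, 5: 0.03,
    6: 0.05, 7: 0.43, 8: 0.03, 9: 0.04, 10: 0.07, 11: 0.04,
}


def dominant_head(weights):
    """Return (head_index, weight) for the largest attribution weight."""
    return max(weights.items(), key=lambda kv: kv[1])


top_head, top_weight = dominant_head(head_weights)
total = sum(head_weights.values())
print(f"head {top_head} carries {top_weight:.2f} of {total:.2f} listed weight")
```

With the values as listed, head 7 has the largest weight, and the twelve rounded weights sum to slightly less than 1.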
Negative Logits
���:-1.98
ailand:-1.66
�:-1.49
braces:-1.43
thood:-1.42
handc:-1.38
��:-1.36
assad:-1.35
earchers:-1.33
razil:-1.31
Positive Logits
bub:1.92
esters:1.57
▓:1.57
juices:1.54
medi:1.53
Rum:1.48
mixture:1.44
bugs:1.43
decay:1.42
depths:1.41
Activations Density 0.001%