Explanations
occurrences of various emotions and personal connections
Negative Logits
Token     Logit
nation    -0.16
ymax      -0.15
emme      -0.15
<<        -0.15
Nation    -0.14
lots      -0.14
erland    -0.14
rv        -0.14
figure    -0.13
Maison    -0.13
Positive Logits
Token     Logit
whole     0.26
whole     0.22
blasted   0.19
Whole     0.19
ole       0.17
nore      0.17
ilden     0.17
Whole     0.17
ol        0.16
家伙      0.16
Activations Density 0.329%