Explanations
words and phrases related to themes of loss and resilience
Negative Logits
®,      -0.16
@@↵     -0.14
bare    -0.14
åĸĦ     -0.14
XX      -0.13
:maj    -0.13
asley   -0.13
...\    -0.13
...',   -0.13
xxxx    -0.13
Positive Logits
!        0.32
!/       0.30
!,       0.29
!.↵↵     0.28
!=       0.25
!");č↵   0.24
!("      0.24
!,↵      0.24
!(       0.23
!\       0.23
Activations Density 0.135%