INDEX
Explanations
phrases indicating emotional states or reactions
the phrase "I was" in various contexts
NEGATIVE LOGITS
Token        Logit
iosyncr      -0.73
izable       -0.67
worthiness   -0.66
negie        -0.66
exploits     -0.66
士           -0.65
ð            -0.64
luence       -0.64
vernment     -0.63
èĢħ          -0.63
POSITIVE LOGITS
Token        Logit
amazed       1.17
terrified    1.16
afraid       1.14
ecstatic     1.10
horrified    1.08
scared       1.08
glad         1.07
shocked      1.07
devastated   1.05
tempted      1.05
Activations Density: 0.169%
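
A density figure like the one above is commonly read as the fraction of token positions on which the feature is active. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are assumptions made for illustration and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 17 of which activate the feature.
sample = [0.0] * 9983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 1% would mean the feature fires on only a small share of tokens, which fits a feature tied to a narrow pattern such as "I was" followed by an emotion word.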