INDEX
Explanations
phrases related to conflict or confrontation
commands and actions related to urgency or danger
Negative Logits
etheless      -0.87
xtap          -0.81
resil         -0.72
prisingly     -0.71
eatures       -0.70
obser         -0.66
ModLoader     -0.66
ibaba         -0.65
ailability    -0.65
aples         -0.64
Positive Logits
!"            1.71
!'"           1.63
!".           1.63
!",           1.62
!'            1.52
'"            1.46
..."          1.45
…"            1.45
,'"           1.45
!!"           1.44
Activations Density 0.378%