Explanations
adverbs that express manner, particularly those associated with calmness or subtlety
Negative Logits
Token      Logit
top        -0.64
stat       -0.60
見た       -0.57
gemaakt    -0.57
<eos>      -0.55
stop       -0.54
ck         -0.53
rad        -0.53
ud         -0.52
st         -0.52
Positive Logits
Token          Logit
safely         1.35
successfully   1.35
calmly         1.31
يتيمه          1.26
successfully   1.26
quietly        1.24
Safely         1.24
peacefully     1.22
Successfully   1.22
Successfully   1.21
Activations Density: 0.272%
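For orientation, here is a minimal sketch of how a density figure like this one could be computed. It assumes that "density" means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 1000 token positions.
toy = [0.0] * 997 + [0.8, 1.2, 0.3]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under that assumed definition, a density of 0.272% would mean the feature is active on roughly 27 of every 10,000 sampled tokens.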