INDEX
Explanations
words related to fear and apprehension
Negative Logits
kers         -0.17
ilter        -0.16
iferay       -0.16
anness       -0.15
ependency    -0.15
quirer       -0.15
rane         -0.15
gers         -0.15
kits         -0.14
arian        -0.14
Positive Logits
lessly       0.36
hã           0.29
fully        0.26
mong         0.26
fulness      0.23
ful          0.23
Fear         0.21
怖           0.20
Fear         0.19
less         0.19
Activations Density: 0.016%