Explanations
entities or terms related to digital technology and social media
Negative Logits
`\            -0.15
engl          -0.14
,copy         -0.13
oppins        -0.13
zung          -0.13
Falk          -0.13
br            -0.13
literature    -0.13
\Bridge       -0.13
abal          -0.12
Positive Logits
_beh          0.29
beh           0.23
_should       0.22
Beh           0.22
Beh           0.21
do            0.20
behaviour     0.20
.beh          0.20
behavior      0.19
behaviours    0.19
Activations Density: 0.006%