Explanations
references to social media or digital media roles
Negative Logits
referenced     -0.16
edor           -0.15
backstory      -0.15
malar          -0.15
unlikely       -0.15
referencing    -0.14
Regardless     -0.14
ocket          -0.14
Moo            -0.13
backlash       -0.13
Positive Logits
occupation     0.27
Occupation     0.22
occupation     0.22
occupations    0.22
occupied       0.21
(“             0.21
–              0.21
occup          0.20
Occup          0.20
Occup          0.19
Activations Density 0.003%