INDEX
Explanations
references to specific online platforms or technologies
references to social media and digital communication
Negative Logits
ategories      -0.75
bernatorial    -0.74
ONSORED        -0.74
ilogy          -0.71
ufact          -0.66
urd            -0.66
emetery        -0.66
jriwal         -0.66
licts          -0.65
rodu           -0.65
Positive Logits
sparing        0.96
wisely         0.90
pseudonym      0.81
levers         0.76
techniques     0.76
tools          0.75
analogy        0.74
gimm           0.73
metaphors      0.72
extensively    0.72
Activations Density 0.612%