Explanations
social media handles preceded by '@'
Negative Logits
Token            Logit
specificity      -0.78
vessels          -0.75
outnumbered      -0.70
lungs            -0.69
favor            -0.68
consolidation    -0.68
conver           -0.68
estab            -0.68
conformity       -0.67
favors           -0.67
Positive Logits
Token            Logit
thereal          1.13
jon              1.06
Coach            1.03
jac              1.01
username         0.98
meg              0.94
sch              0.93
sam              0.92
jay              0.91
nat              0.91
Activations Density: 0.023%