INDEX
Explanations
Twitter handles and website URLs
abbreviations or acronyms often related to digital media or technology contexts
Negative Logits
ATIVE     -0.83
LIFE      -0.77
FUL       -0.73
IZE       -0.70
CONTROL   -0.68
afore     -0.68
ALLY      -0.67
Zed       -0.67
GROUND    -0.64
SHE       -0.64
Positive Logits
cs     1.09
cc     1.01
gs     0.97
fp     0.97
fs     0.97
cb     0.96
ctr    0.95
RN     0.95
cd     0.95
pd     0.94
Activations Density 0.135%