INDEX
Explanations
words related to technological development and experimentation
Negative Logits
shock    -0.76
gee      -0.72
perty    -0.69
chy      -0.63
Haram    -0.63
clad     -0.62
pox      -0.62
pton     -0.61
pload    -0.60
BOX      -0.60
Positive Logits
ruct     1.22
ream     1.22
reet     1.18
rophe    1.16
ophe     1.16
ensibly  1.16
rict     1.13
rike     1.11
opher    1.08
orian    1.03
Activations Density: 0.883%