INDEX
Explanations
words related to the concept of structure, either physical or social
words related to structural and cultural significance
Negative Logits
ials        -0.71
Riders      -0.70
Express     -0.69
Delivery    -0.68
ger         -0.66
gers        -0.65
HERO        -0.64
Channel     -0.64
ties        -0.64
raviolet    -0.63
Positive Logits
urally         1.14
psey           0.95
pleasing       0.88
conduc         0.83
regenerate     0.81
exting         0.80
distingu       0.79
hematically    0.78
compr          0.76
bane           0.76
Activations Density 0.016%