INDEX
Explanations
words related to physical locations or structures
words related to "core" or "core concepts."
Negative Logits
Token        Logit
endor        -0.85
Order        -0.68
Honour       -0.66
enstein      -0.65
Sacrament    -0.65
leck         -0.65
Guilty       -0.65
Disorder     -0.64
hirt         -0.64
hran         -0.64
Positive Logits
Token        Logit
nered        1.07
vette        0.89
fman         0.87
rection      0.85
bons         0.81
pling        0.81
ption        0.79
nea          0.79
ocular       0.78
aling        0.77
Activations Density: 0.027%