INDEX
Explanations
terms related to three-dimensional representations or illusions
terms related to dimensionality in spatial representations
Negative Logits
INST      -0.73
ICA       -0.70
AV        -0.69
ALT       -0.67
UES       -0.65
raltar    -0.64
MU        -0.64
exec      -0.64
creen     -0.62
CHAT      -0.61
Positive Logits
imensional    1.33
dimensional   1.15
dimensional   1.00
ikuman        0.92
ĸļ            0.91
dimension     0.85
ILCS          0.77
ynamic        0.76
hyde          0.76
dimension     0.75
Activations Density 0.010%