INDEX
Explanations
words related to exceptional examples or extreme cases
terms related to symbolic representation and embodiment of concepts
Negative Logits
foreseen    -0.96
jab         -0.82
ften        -0.66
swer        -0.65
peer        -0.63
supp        -0.61
leased      -0.60
picking     -0.60
asive       -0.60
heed        -0.60
Positive Logits
imum        0.92
embodiment  0.84
virtues     0.81
qualities   0.80
exempl      0.76
archetype   0.75
perfection  0.74
ideal       0.73
example     0.73
excellence  0.73
Activations Density: 0.187%