INDEX
Explanations
descriptions or mentions of awe, amazement, or admiration within a variety of contexts
Negative Logits
Token        Logit
Luxem        -0.65
cision       -0.63
Gemini       -0.63
姫           -0.62
士           -0.62
phrine       -0.62
xual         -0.59
Kaepernick   -0.57
iod          -0.56
Feld         -0.56
Positive Logits
Token        Logit
akens        1.04
akening      0.90
kward        0.88
erness       0.79
iring        0.74
arak         0.73
ards         0.72
aw           0.72
atche        0.72
orld         0.72
Activations Density: 10.883%