INDEX
Explanations
expressions starting with the phrase "What a" followed by an adjective or descriptor
Negative Logits
arent         -0.82
gans          -0.81
interstitial  -0.81
yip           -0.80
aimon         -0.78
Topics        -0.78
aspers        -0.77
icans         -0.76
rences        -0.76
◼             -0.75
Positive Logits
wrench        0.85
nause         0.80
shame         0.79
miserable     0.75
difference    0.75
sad           0.74
unemploy      0.74
evening       0.73
disgusting    0.73
flawed        0.72
Activations Density: 0.028%