INDEX
Explanations
words related to being impressed
expressions of admiration or being impressed
Negative Logits
©¶æ        -0.71
rules      -0.70
violence   -0.70
rama       -0.69
turn       -0.66
access     -0.64
andro      -0.63
nuclear    -0.63
diverted   -0.63
portion    -0.63
Positive Logits
impressed  0.92
mented     0.77
oresc      0.76
resemb     0.74
surprised  0.74
sson       0.73
MENTS      0.72
ulously    0.71
iated      0.69
ochet      0.67
Activations Density: 0.012%