Explanations
phrases related to emphasizing a point or reacting to a statement
phrases or expressions that convey a sense of disbelief or surprise
Negative Logits
iatric      -0.79
oreal       -0.73
iliated     -0.73
ieg         -0.72
oufl        -0.71
strate      -0.70
ettel       -0.68
Contents    -0.68
ivil        -0.67
éĸ          -0.67
Positive Logits
!:          0.79
Kev         0.77
yourselves  0.72
Suns        0.67
Cory        0.66
analogy     0.63
Sco         0.62
!'"         0.61
Cohn        0.60
Krish       0.60
Activations Density 0.169%