INDEX
Explanations
_phrases indicating a strong opinion or judgment._
phrases that convey the act of stating or expressing thoughts
Negative Logits
panic       -0.65
ascript     -0.62
notebook    -0.61
Frie        -0.59
catentry    -0.59
fram        -0.58
infl        -0.58
ty          -0.57
aspx        -0.56
captcha     -0.54
Positive Logits
goodbye     0.97
nothing     0.97
farewell    0.78
nothing     0.70
INGS        0.68
aloud       0.67
hern        0.67
hello       0.66
hem         0.65
Goodbye     0.65
Activations Density: 0.056%