Explanations
phrases expressing strong emotions or opinions
phrases that emphasize assertion or opinion
Negative Logits
Cruiser    -0.74
ibal       -0.62
swick      -0.61
fingert    -0.58
Lauder     -0.58
artment    -0.56
Mechan     -0.56
allery     -0.56
Liver      -0.55
ensing     -0.54
Positive Logits
goodbye    1.08
aloud      1.02
loudly     0.93
farewell   0.84
bluff      0.82
Goodbye    0.76
louder     0.75
hello      0.74
ript       0.72
loud       0.72
Activations Density 0.263%