INDEX
Explanations
interjections and expressions of personal opinion or emotion
Negative Logits
Token        Logit
]--;         -0.67
😞           -0.60
stantial     -0.58
{}\          -0.57
'<?          -0.56
}\]          -0.54
vast         -0.53
::*;         -0.53
😔           -0.52
requently    -0.52
Positive Logits
Token        Logit
Plus         1.07
Plus         1.04
PLUS         0.82
Enjoy        0.80
Bonus        0.79
plus         0.78
Bonus        0.78
PLUS         0.75
bonus        0.74
plus         0.73
Activations Density: 0.119%