INDEX
Explanations
phrases expressing strong emotions or opinions
expressions of personal feelings and states of being
Negative Logits
Sierra     -0.67
CE         -0.62
Analysis   -0.61
Shack      -0.61
Mand       -0.61
tariffs    -0.60
Defense    -0.60
toxin      -0.60
Leave      -0.60
Quotes     -0.57
Positive Logits
myself     0.80
ãĥĥãĤ¯     0.79
ãĤ´        0.79
ãĥ¼ãĥ«     0.77
ðŁij       0.74
ograp      0.73
daq        0.73
querque    0.71
wondering  0.70
ðŁij       0.69
Activations Density 0.075%