Explanations
Phrases related to personal experiences and emotional reflections
Positive sentiment, often with intensifiers
Swear words and negative descriptors
Negative Logits
}],           -0.77
hiszen        -0.68
محفوظة        -0.65
DoubleQuotes  -0.65
/>';          -0.64
]-->          -0.63
/>";          -0.63
//----        -0.62
"}\           -0.62
++];          -0.62
Positive Logits
stupid        1.07
fucking       1.05
stupidly      1.03
goddamn       1.01
dumbass       0.93
freaking      0.93
idiot         0.92
shitty        0.92
annoying      0.91
FUCKING       0.91
Activations Density 1.007%