INDEX
Explanations
repeated phrases related to reflection and introspection
Negative Logits
Token              Logit
IsContent          -0.70
nahilalakip        -0.69
extAlignment       -0.68
iNdEx              -0.68
providedIn         -0.67
quiera             -0.66
ویکیپدیای          -0.66
expandindo         -0.66
saites             -0.65
BufferException    -0.65
Positive Logits
Token              Logit
thinking           1.11
thoughts           1.04
think              1.02
thought            1.00
THINK              0.99
Thinking           0.98
thought            0.98
Thinking           0.98
Think              0.98
consideration      0.97
Activations Density 0.142%