Explanations
my purpose is to be helpful and harmless
Negative Logits
benzer    0.82
voyez    0.77
parecido    0.76
similar    0.76
Ditto    0.76
مشابه    0.75
simili    0.74
tweaks    0.73
mirip    0.73
OTHER    0.72
Positive Logits
日益    1.55
Nowadays    1.54
越来越多的    1.51
nowadays    1.50
Nowadays    1.47
increasingly    1.44
越来越    1.38
zunehm    1.35
越来越多    1.33
incessantly    1.32
Activations Density: 0.200%