INDEX
Explanations
phrases that involve the concept of kindness
Negative Logits
Token    Logit
cher     -0.17
ç¼       -0.16
à¥Ģण     -0.15
ximo     -0.15
ánu      -0.15
çon      -0.15
ñana     -0.15
ional    -0.15
sse      -0.15
quia     -0.14
Positive Logits
Token      Logit
red        0.44
erg        0.36
ergarten   0.35
ling       0.34
led        0.34
reds       0.34
gom        0.32
heart      0.31
-hearted   0.28
RED        0.26
Activations Density: 0.032%