INDEX
Explanations
emotional and relational dynamics, particularly around loss and caring actions
Negative Logits
praticamente      -0.70
Darn              -0.70
Aufgrund          -0.68
sumamente         -0.68
viamente          -0.67
äußerst           -0.65
extremadamente    -0.64
Asimismo          -0.63
)!                -0.63
např              -0.63
Positive Logits
fucking           0.71
fucking           0.66
noons             0.66
genstein          0.65
fucked            0.64
fuck              0.62
stillness         0.58
fuck              0.58
nameless          0.56
رشف               0.55
Activations Density 0.660%