INDEX
Explanations
words related to loss or negative events
instances of loss or the word "lost" in various contexts
Negative Logits
Token      Logit
sis        -0.73
osi        -0.68
gon        -0.66
mens       -0.65
onomic     -0.62
Janeiro    -0.61
oin        -0.60
heid       -0.60
primed     -0.59
agher      -0.59
Positive Logits
Token      Logit
luster     0.80
miser      0.74
aversion   0.72
souls      0.70
sight      0.70
ittens     0.69
bite       0.68
touch      0.68
esses      0.67
itive      0.67
Activations Density: 0.032%