Explanations
words related to criticism or evaluating something negatively
instances of the letter 'r'
Negative Logits
Token         Logit
Sons          -0.74
cardio        -0.66
hers          -0.64
EntityItem    -0.64
ROM           -0.64
Lobby         -0.60
mine          -0.60
ours          -0.60
rope          -0.59
SOS           -0.58
Positive Logits
Token         Logit
ipples        1.39
ambling       1.33
ifling        1.33
umbling       1.28
agged         1.27
iddle         1.27
attle         1.26
iddles        1.26
haps          1.23
angers        1.23
Activations Density: 0.015%