INDEX
Explanations
words related to strong emotional reactions or intensity
words related to significant legal terms and conditions
Negative Logits
spr            -0.94
sweet          -0.78
jack           -0.74
lance          -0.70
lesi           -0.70
FFFF           -0.68
nah            -0.66
els            -0.66
Spr            -0.66
eming          -0.66
Positive Logits
metic           0.94
atically        0.84
guiActiveUn     0.84
ixel            0.80
othing          0.80
othe            0.78
captcha         0.77
ographic        0.74
ancest          0.74
orical          0.71
Activations Density 0.067%
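
The density figure can be read as the share of token positions on which the feature fires at all. The page does not state its exact definition, so the sketch below assumes "fires" means an activation strictly above zero. The function name `activation_density` and the sample data are hypothetical: the sample was built by hand to land on the same rounded percentage and is not data from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: a feature counts as active on a token when its
    activation exceeds the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 3000 token positions, the feature fires on 2 of them.
sample = [0.0] * 2998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.067%"
```

A density this low means the feature is silent on nearly every token, so the top-logit lists above describe its effect only on the rare positions where it is active.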