Explanations
words related to caring or concern for others
expressions of concern or caring towards others
Negative Logits
jam          -0.64
ãĥĪ          -0.64
antha        -0.60
UES          -0.59
captcha      -0.58
Assembly     -0.58
NVIDIA       -0.56
uti          -0.54
GV           -0.54
scrambled    -0.54
Positive Logits
lessly        1.24
about         1.15
passionately  1.09
deeply        1.04
ABOUT         1.01
lessness      0.96
ened          0.91
about         0.86
enough        0.85
taker         0.84
Activations Density: 0.027%