INDEX
Explanations
words related to showing concern or attention towards something
references to caring or concern for others
Negative Logits
GV          -0.66
jam         -0.65
UES         -0.64
ãĥĪ         -0.63
antha       -0.62
Loading     -0.59
akedown     -0.58
Upper       -0.57
Assembly    -0.56
ãĥ³ãĤ¸      -0.55
Positive Logits
passionately    1.17
lessly          1.12
deeply          0.98
about           0.94
taker           0.90
lessness        0.90
der             0.89
ABOUT           0.85
ened            0.81
ately           0.80
Activations Density 0.032%