Explanations
terms related to affection, admiration, or preference towards something or someone
Negative Logits
avis          -0.61
Noon          -0.61
ilib          -0.60
transitions   -0.58
Degree        -0.57
å¸            -0.56
disruptive    -0.56
cumbers       -0.56
Extras        -0.55
briefings     -0.55
Positive Logits
passionately  0.92
rend          0.85
wart          0.84
atical        0.83
joy           0.83
lead          0.81
leader        0.80
uay           0.79
uncond        0.79
bear          0.78
Activations Density 2.471%
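
The density figure above can be read as a simple ratio. The sketch below is an illustration only: it assumes that "activations density" means the fraction of sampled token positions on which the feature's activation is above zero, which this page does not state. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if len(activations) == 0:
        raise ValueError("activations must contain at least one value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100 token positions, 3 of which activate the feature.
sample = [0.0] * 97 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "3.000%"
```

Under this reading, a density of 2.471% would mean the feature fires on roughly 1 in 40 sampled tokens.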