INDEX
Explanations
phrases related to aspirations and values
concepts related to social issues and community dynamics
Negative Logits
ppa           -0.55
limb          -0.55
snaps         -0.54
Volunte       -0.54
eb            -0.53
database      -0.53
iken          -0.52
pm            -0.52
iff           -0.51
Pastebin      -0.51
Positive Logits
ynchronous    0.67
cum           0.62
within        0.60
WithNo        0.60
embodied      0.58
arently       0.58
whose         0.58
otten         0.57
nonetheless   0.57
comed         0.57
Activations Density: 0.419%