Explanations
phrases related to public concern or societal issues
Negative Logits
Sounds    -0.15
esk       -0.15
lef       -0.15
utter     -0.14
sounds    -0.14
ialias    -0.14
aidu      -0.14
Sounds    -0.14
oder      -0.14
ntax      -0.13
Positive Logits
success    0.17
apes       0.17
proof      0.16
oste       0.15
lessons    0.15
velt       0.15
succeeded  0.15
lesson     0.15
proof      0.14
ajaran     0.14
Activations Density: 0.009%