INDEX
Explanations
phrases related to societal issues or discussions
phrases related to societal issues and individual experiences
Negative Logits
$.          -0.77
gobl        -0.75
mathemat    -0.67
showc       -0.67
arrang      -0.65
notor       -0.65
Niet        -0.63
Palestin    -0.62
Cairo       -0.59
traged      -0.58
Positive Logits
meanwhile   0.69
¶           0.66
analogy     0.62
requires    0.61
helps       0.61
Community   0.60
Yourself    0.59
entails     0.58
huh         0.57
Cause       0.57
Activations Density: 0.866%