INDEX
Explanations
discourse around political critique and socio-economic issues
Negative Logits
swire     -0.15
isay      -0.14
oreach    -0.14
tempt     -0.13
еÑĢо      -0.13
ospel     -0.13
лÑİд      -0.13
Quyết     -0.13
tùy       -0.13
.feed     -0.13
Positive Logits
look      0.44
remember  0.39
recall    0.37
consider  0.36
Look      0.34
remember  0.32
look      0.32
notice    0.31
Remember  0.31
recall    0.31
Activations Density 0.475%