INDEX
Explanations
connections to political themes and significant changes in a narrative
Negative Logits
Token       Logit
çĽ£         -0.16
conde       -0.15
undra       -0.14
еÑĤи        -0.14
rowable     -0.14
lam         -0.13
.inputs     -0.13
ÑĢиÑĩ       -0.13
astos       -0.13
коном       -0.13

Positive Logits
Token       Logit
UDA         0.15
aby         0.15
Ta          0.14
ogui        0.14
884         0.14
Dodd        0.14
Telegraph   0.13
acades      0.13
heats       0.13
thick       0.13
Activations Density: 0.236%