Explanations
phrases dealing with self-perception and existential reflection
Negative Logits
⌐               -0.50
hassee          -0.50
forChild        -0.49
__":            -0.48
manera          -0.47
baric           -0.46
WithFormat      -0.45
ligiloj         -0.45
AxisAlignment   -0.45
ritch           -0.44
Positive Logits
what    3.67
what    2.67
What    2.23
What    2.06
WHAT    1.92
ceea    1.84
hvad    1.80
WHAT    1.72
آنچه    1.70
ciò     1.70
Activations Density 2.027%