INDEX
Explanations
references to influential figures and their impacts on society
Negative Logits
Token          Logit
iſt            -1.07
ſind           -1.00
itſelf         -0.98
.³             -0.93
ſelf           -0.92
numerusform    -0.90
་་             -0.90
.",            -0.88
AppColors      -0.87
</caption>     -0.87
Positive Logits
Token          Logit
stuff          0.92
maybe          0.81
I              0.80
kinda          0.77
my             0.76
mierda         0.73
наверное       0.73
crappy         0.73
とか           0.72
&              0.71
Activations Density: 2.234%