INDEX
Explanations
numerical representations and values
Negative Logits
Token              Logit
’                  -0.57
-                  -0.54
(token not shown)  -0.47
(                  -0.46
Re                 -0.45
'                  -0.45
↵↵                 -0.44
der                -0.44
a                  -0.44
,                  -0.44
Positive Logits
Token       Logit
Савезне     1.04
ſelves      1.01
ſtate       0.99
purpoſe     0.98
ſelf        0.98
iſt         0.94
neſs        0.93
auffi       0.93
kaarangay   0.92
myſelf      0.92
Activations Density 0.344%