INDEX
Explanations
numerical data and formatting elements
Negative Logits
;          -0.52
.          -0.50
feeling    -0.48
:          -0.46
?          -0.45
,          -0.44
↵↵         -0.44
felt       -0.44
-          -0.43
.          -0.43
Positive Logits
Efq              0.94
autorytatywna    0.83
Phry             0.80
ſelf             0.78
himſelf          0.75
themſelves       0.74
myſelf           0.74
iſt              0.71
ſelves           0.71
Shakspeare       0.70
Activations Density 0.137%