Explanations
special characters and their repeated patterns
Negative Logits
Token       Logit
المعيارى    -0.70
USTIN       -0.62
estimés     -0.61
.           -0.60
[blank]     -0.60
↵↵          -0.59
(           -0.59
ar          -0.59
,           -0.57
-           -0.57
Positive Logits
Token       Logit
ainfi       1.17
feroit      1.09
avoient     1.05
auroit      1.03
plufieurs   1.03
auffi       1.02
myſelf      1.00
raiſ        0.97
pouvoit     0.96
juſt        0.95
Activations Density 0.702%