Explanations
special formatting or symbolic representations in the text
Negative Logits
Token               Logit
(token not shown)   -0.61
tania               -0.56
hal                 -0.53
«                   -0.52
I                   -0.51
ρης                 -0.51
(token not shown)   -0.51
hout                -0.49
stara               -0.47
baye                -0.47
Positive Logits
Token               Logit
itſelf              1.07
(token not shown)   1.02
للمعارف             0.92
betweenstory        0.92
neceff              0.89
AssemblyCulture     0.88
Reſ                 0.87
Majefty             0.86
himſelf             0.86
Efq                 0.84
Activations Density 0.166%