Explanations
expressions indicating major problems or challenges
Negative Logits
ut          -0.43
            -0.40
tot         -0.40
lo          -0.37
read        -0.37
infine      -0.36
plus        -0.35
his         -0.35
Bar         -0.35
lop         -0.35
Positive Logits
Monfieur    1.03
Theſe       1.02
Efq         1.02
becauſe     0.99
Jefus       0.98
Majefty     0.98
\{\\        0.97
виправивши  0.94
purpoſe     0.94
ſtate       0.93
Activations Density 0.162%