INDEX
Explanations
verb phrases indicating speech or statements made by individuals
Negative Logits
Theſe
-0.92
^(@)
-0.91
Majefty
-0.90
tartalomajánló
-0.88
refour
-0.84
Anſ
-0.83
Efq
-0.83
Diſ
-0.83
ſtate
-0.83
purpoſe
-0.83
Positive Logits
Portail
0.64
,
0.52
and
0.52
(
0.51
日
0.50
I
0.43
↵↵
0.43
وله
0.42
0.41
"
0.41
Activations Density 0.004%