INDEX
Explanations
references to the English language or its representation in text
start of user turn
Negative Logits
StringTokenizer    -0.57
pleaſure           -0.54
houſe              -0.54
myſelf             -0.52
IMPORTED           -0.52
insatz             -0.52
ſch                -0.51
gräns              -0.50
anyahu             -0.50
miniaturka         -0.50
Positive Logits
s                  0.51
rs                 0.48
لينكات             0.43
BS                 0.43
AS                 0.43
riks               0.43
Gre                0.42
.                  0.41
mas                0.41
Trama              0.40
Activation Density: 0.000%