INDEX
Explanations
questions that begin with "What" or "Which"
Negative Logits
queſta          -0.94
OGND            -0.88
Autoritní       -0.76
kasarigan       -0.73
houſe           -0.73
PeEnEo          -0.72
ſtanding        -0.71
Rujuakan        -0.71
autorytatywna   -0.71
ſei             -0.67
Positive Logits
&               0.35
Marks           0.33
(blank)         0.32
marks           0.32
Investing       0.31
-               0.31
Garden          0.30
the             0.30
Marca           0.30
Rub             0.30
Activations Density 0.101%