INDEX
Explanations
instances of findings or observations related to research results
Negative Logits
انتهای	-0.44
yntaxException	-0.42
nipeg	-0.41
Roskov	-0.41
ואת	-0.40
Ato	-0.39
dehy	-0.38
Italijani	-0.38
sobriety	-0.38
Chy	-0.38
Positive Logits
Found	0.94
found	0.92
Found	0.86
FOUND	0.71
found	0.71
FOUND	0.68
observed	0.61
Observed	0.61
encontrado	0.56
ditemukan	0.56
Activations Density 0.049%