INDEX
Explanations
descriptions of how well things match or align with particular criteria
Negative Logits
שוליים    -0.86
تضيفلها    -0.81
الحره    -0.81
انجليز    -0.79
Efq    -0.78
المشاركات    -0.78
Shaksp    -0.76
cagon    -0.76
ρίς    -0.73
uksessa    -0.73
Positive Logits
snug    0.77
Fits    0.73
Fits    0.69
fits    0.64
fits    0.63
Fitz    0.56
fit    0.55
into    0.54
FIT    0.54
fitting    0.54
Activations Density 0.081%