Explanations
the letter 'L' in various contexts
Negative Logits
Token              Logit
avoient            -0.89
étoient            -0.88
desmotivaciones    -0.87
enfans             -0.85
zijne              -0.82
auroit             -0.79
étoit              -0.79
ainfi              -0.75
feroit             -0.75
miniaturka         -0.75
Positive Logits
Token     Logit
l         0.90
L         0.85
ل         0.77
L         0.75
ل         0.75
cancel    0.73
ล         0.69
ल         0.68
Л         0.65
ల         0.63
Activations Density: 0.309%