Explanations
beginning, end, or boundary
Negative Logits
THE         0.41
найбіль     0.40
The         0.39
the         0.38
teh         0.37
Many        0.36
Included    0.35
Only        0.35
Rarely      0.35
найбільш    0.34
Positive Logits
outset      0.65
same        0.56
beginning   0.52
brink       0.49
end         0.48
same        0.47
equator     0.47
confines    0.43
guise       0.43
cusp        0.42
Activations Density 0.043%