INDEX
Explanations
articles such as "the" and "a"
into the [noun/concept]
Negative Logits
Token        Logit
ähteet       -0.76
zijne        -0.74
enfans       -0.73
ſta          -0.72
avoient      -0.72
dezelve      -0.69
keber        -0.69
zoude        -0.68
calcetines   -0.68
ſal          -0.66
Positive Logits
Token        Logit
into         1.13
INTO         1.00
Into         0.97
into         0.91
Into         0.90
INTO         0.74
in           0.56
onto         0.56
isIn         0.55
inta         0.54
Activations Density: 0.031%