INDEX
Explanations
significant nouns or phrases that denote concepts or items
Negative Logits
Token        Logit
rsa          -0.15
_stream      -0.15
arching      -0.14
Worksheet    -0.14
inds         -0.14
flu          -0.14
uz           -0.13
STREAM       -0.13
indo         -0.13
endo         -0.13
Positive Logits
Token        Logit
deb          0.22
ibus         0.16
auer         0.15
ä½           0.15
Ïģαν         0.15
Universal    0.15
Sund         0.15
Fry          0.15
abella       0.14
ubb          0.14
Activations Density 0.001%