INDEX
Explanations
References to "doc" or variations of it, indicating a focus on documentation or documents.
Negative Logits
Token      Logit
ized       -0.19
är         -0.15
Veteran    -0.15
IZED       -0.15
ively      -0.15
irms       -0.15
hed        -0.15
ately      -0.15
fall       -0.15
ers        -0.14
Positive Logits
Token      Logit
uments     0.32
umen       0.26
ument      0.22
lava       0.19
/doc       0.19
Doc        0.19
umn        0.18
quier      0.18
ente       0.18
UMENT      0.18
Activations Density 0.012%