Explanations
The word "Plus" and its variations, signaling that something is being added or that additional context follows.
Negative Logits
tube        -0.17
eer         -0.16
nt          -0.15
ize         -0.15
INSTANCE    -0.14
asthan      -0.14
igo         -0.14
WER         -0.14
ube         -0.13
بس          -0.13
Positive Logits
ieurs                               0.21
-minus                              0.18
quam                                0.17
++++++++++++++++++++++++++++++++    0.17
vier                                0.17
sclerosis                           0.16
ça                                  0.15
ÑĶм                                 0.15
vrier                               0.15
lector                              0.14
Activations Density 0.020%
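
The density figure above is easiest to read as a fraction of token positions on which the feature fires. The sketch below shows one way such a number could be computed. It assumes density means "share of sampled token positions with activation above zero", which this page does not state. The function name, the threshold parameter and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled positions on which the
    feature is active; the page itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, of which 2 are active.
sample = [0.0] * 9998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.020%
```

Under that reading, a density of 0.020% corresponds to roughly 2 active positions per 10,000 sampled tokens.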