Explanations
references to recipes and cooking instructions
Negative Logits
Correct       -0.16
edd           -0.16
regon         -0.15
á             -0.14
argar         -0.14
tender        -0.14
na            -0.14
-dashboard    -0.14
маз           -0.13
vé            -0.13
Positive Logits
instructions   0.19
Instructions   0.18
instructions   0.17
Instructions   0.16
loff           0.15
hare           0.15
ورش            0.15
allis          0.15
oire           0.14
explicit       0.14
Activations Density 0.085%