Explanations
instructions and recommendations about practical tasks
Negative Logits
<bos>             -0.60
incredible        -0.57
unprecedented     -0.57
ďaka              -0.56
malheureusement   -0.50
navnet            -0.50
なんと            -0.50
necesite          -0.49
TypeDefinition    -0.49
noDo              -0.49
Positive Logits
preferably        1.38
preferably        1.30
Preferably        1.07
suitable          0.96
reputable         0.96
atleast           0.95
appropriate       0.90
préférence        0.89
möglichst         0.87
reliable          0.86
Activations Density: 0.400%