Explanations
examples or instances
instances of the word "Example" and variations of it, indicating examples or references within the text
Negative Logits
Token       Logit
aciously    -0.80
rica        -0.71
sweats      -0.70
iosity      -0.69
roid        -0.69
oop         -0.69
tooth       -0.67
ismo        -0.67
silence     -0.67
fur         -0.66
Positive Logits
Token       Logit
Examples    1.07
Example     1.06
Helpful     0.93
Sample      0.92
Example     0.90
Prompt      0.89
Explan      0.88
Casting     0.88
Suggest     0.87
Typical     0.86
Activations Density: 0.034%