INDEX
Explanations
Interesting or intriguing concepts and ideas
Instances of the word "interesting."
Negative Logits
oise          -0.78
eded          -0.74
aper          -0.74
helle         -0.73
uts           -0.72
arest         -0.71
aret          -0.71
reditation    -0.70
otent         -0.69
anguage       -0.68
Positive Logits
tid           0.93
Flavoring     0.92
twists        0.79
arios         0.77
Magikarp      0.76
lihood        0.74
trivia        0.74
sidel         0.73
insights      0.71
surprises     0.71
Activations Density 0.034%