INDEX
Explanations
words related to exploration and investigation
Negative Logits
########.        -0.91
majánló          -0.85
disambiguazione  -0.85
<pad>            -0.79
<unused52>       -0.79
<unused41>       -0.79
twimg            -0.79
<unused42>       -0.78
<unused74>       -0.78
<unused3>        -0.78
Positive Logits
explore         0.93
exploring       0.91
exploration     0.89
Explore         0.84
investigation   0.82
explored        0.80
investigate     0.80
investigating   0.78
investigations  0.77
explore         0.76
Activations Density 0.263%