INDEX
Explanations
generic action verbs related to exploration and discovery
instances of the word "explore" and its variations
Negative Logits
token         logit
ivari         -0.71
fight         -0.69
jud           -0.69
fixed         -0.65
iah           -0.63
gage          -0.62
lat           -0.61
wa            -0.60
processing    -0.60
cake          -0.59
Positive Logits
token         logit
ationally     0.91
vier          0.86
avenues       0.82
ibilities     0.81
nels          0.80
schild        0.79
prising       0.78
Ô             0.78
çīĪ           0.78
ĸļ            0.78
Activations Density: 0.033%