INDEX
Explanations
phrases indicating that something is easily done or perceived
phrases that highlight simplicity or ease of understanding
Negative Logits
eters     -0.79
grave     -0.77
borg      -0.74
raints    -0.73
mbuds     -0.73
bane      -0.67
emp       -0.66
pter      -0.65
reen      -0.64
arks      -0.64
Positive Logits
prey              0.86
Jet               0.86
enough            0.79
easy              0.77
going             0.76
wired             0.76
coded             0.74
pmwiki            0.72
understandable    0.71
Deal              0.67
Activations Density: 0.032%