Explanations
phrases involving the word "to" indicating intention or direction
Negative Logits
grain          -0.72
sbm            -0.71
interstitial   -0.59
recated        -0.59
llo            -0.58
yg             -0.57
guessed        -0.57
exited         -0.56
contradicted   -0.56
froze          -0.56
Positive Logits
ggles          1.02
iling          0.95
pload          0.93
maximize       0.85
asting         0.83
ilet           0.80
wered          0.80
extinction     0.77
ilings         0.74
asts           0.74
Activations Density: 0.112%