Explanations
go + directional adverbs/prepositions
NEGATIVE LOGITS
            0.40
Bud         0.37
Glo         0.37
Spot        0.37
Scroll      0.35
Civ         0.34
Conven      0.34
Convention  0.34
größten     0.34
'           0.34
POSITIVE LOGITS
overboard   0.86
astray      0.68
fishing     0.68
bananas     0.64
snorkeling  0.61
into        0.61
bankrupt    0.61
unnoticed   0.57
shopping    0.56
skiing      0.55
Activations Density: 0.031%