Explanations
statements indicating confirmation or validation
the word "indeed" in various contexts
Negative Logits
center      -0.63
imov        -0.62
Obst        -0.62
rals        -0.62
Winged      -0.61
isson       -0.60
Ĺ           -0.59
bern        -0.59
ogy         -0.58
Cooldown    -0.58
Positive Logits
ional        0.89
indeed       0.82
uala         0.77
EngineDebug  0.74
NESS         0.72
behaved      0.68
actory       0.68
behold       0.68
akedown      0.68
exceed       0.66
Activations Density: 0.006%