Explanations
questions or exclamations demanding attention or explanation
instances of the word "What."
Negative Logits
Token      Logit
ulic       -0.72
trop       -0.67
println    -0.59
emp        -0.59
rolet      -0.58
cel        -0.58
POR        -0.58
present    -0.57
lich       -0.56
Sport      -0.56
Positive Logits
Token          Logit
soever         1.33
happens        0.98
happened       0.97
happ           0.91
sorts          0.88
kinds          0.86
transpired     0.80
nces           0.76
else           0.74
constitutes    0.71
Activations Density: 0.109%