INDEX
Explanations
instances of the word "return"
Negative Logits
inx      -0.71
pees     -0.69
Cola     -0.63
forum    -0.62
ancies   -0.61
rum      -0.61
enges    -0.61
rote     -0.61
ãĥ»      -0.61
outed    -0.60
Positive Logits
return      3.73
return      2.62
Return      2.51
returns     2.49
Return      2.08
returned    2.06
returning   2.01
Returns     1.82
Returning   1.62
Returns     1.53
Activations Density 0.021%