Explanations
occurrences of the word "out"
Negative Logits
_output        -0.25
output         -0.21
argout         -0.20
OUTPUT         -0.19
_outputs       -0.19
outbreak       -0.19
outputs        -0.19
Output         -0.19
outdoor        -0.18
outstanding    -0.18
Positive Logits
wards          0.41
ta             0.38
-of            0.27
land           0.26
lying          0.24
SIDE           0.24
/down          0.24
lander         0.23
ters           0.23
sert           0.23
Activations Density 0.209%